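IT技术精华 (IT Tech Highlights) http://it.taocms.org/ Aggregating and sharing the best Chinese IT technical articles to help IT professionals grow. 2019-08-24 IT技术精华
14625 Naive Bayes in Python (Watermelon Book)

Naive Bayes in Python (Watermelon Book)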

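Abstract:

Naive Bayes is one of the most common classification methods in machine learning. It is very convenient for binary classification problems whose features are discrete attributes: the principle is simple, training is efficient, and it often fits well.

Naive Bayes

Bayes' formula:

P(c \mid x) = \frac{P(c)\, P(x \mid c)}{P(x)}

Naive Bayes is called "naive" because it assumes the features are mutually independent given the class, so the following holds:

P(x \mid c) = \prod_{i} P(x_i \mid c)

The naive Bayes decision rule is therefore:

h(x) = \arg\max_{c} P(c) \prod_{i} P(x_i \mid c)

In practice the formula is adjusted slightly:

  1. Some values P(x_i | c) can be very small, and the product of many of them may underflow to 0. Taking the log of both sides turns the product into a sum and avoids the repeated-multiplication problem.
  2. P(c) and P(x_i | c) are usually not computed directly from sample frequencies; Laplace smoothing is typically applied.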
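Point 1 can be illustrated with a minimal sketch. The probability value and the number of features below are made up for illustration; the point is only that a long product of small probabilities underflows in double precision while the sum of logs stays finite.

```python
import math

# Hypothetical values: 400 features, each with conditional probability 0.01
probs = [0.01] * 400

product = 1.0
for p in probs:
    product *= p  # the true value, 1e-800, is below the smallest positive double

log_sum = sum(math.log(p) for p in probs)  # 400 * ln(0.01)

print(product)  # underflows to 0.0
print(log_sum)  # finite, about -1842.07
```

Because log is monotonic, comparing classes by the sum of logs picks the same class as comparing them by the product.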

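The smoothed class prior is:

P(c) = \frac{|D_c| + 1}{|D| + N}

Here |D_c| is the number of samples in class c, |D| is the total number of training samples, and N is the number of possible classes.

The smoothed conditional probability is:

P(x_i \mid c) = \frac{|D_{c,x_i}| + 1}{|D_c| + N_i}

Here |D_{c,x_i}| is the number of samples in class c whose feature i takes the value x_i, |D_c| is the number of samples in class c, and N_i is the number of possible values of feature i.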
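As a rough sketch of the two smoothing formulas, the snippet below applies them to counts taken from the watermelon dataset (8 good melons, 9 bad ones). The helper names `laplace_prior` and `laplace_conditional` are illustrative and are not part of the article's code.

```python
from collections import Counter

def laplace_prior(labels):
    """Smoothed class prior: (|D_c| + 1) / (|D| + N)."""
    counts = Counter(labels)
    n_classes = len(counts)
    return {c: (k + 1) / (len(labels) + n_classes) for c, k in counts.items()}

def laplace_conditional(values, all_values):
    """Smoothed P(x_i | c) from one class's feature values:
    (|D_{c,x_i}| + 1) / (|D_c| + N_i)."""
    counts = Counter(values)
    n_i = len(all_values)
    return {v: (counts.get(v, 0) + 1) / (len(values) + n_i) for v in all_values}

labels = ["是"] * 8 + ["否"] * 9
prior = laplace_prior(labels)
print(prior)  # 是: 9/19, 否: 10/19

# 根蒂 among the 8 good melons: 5 蜷缩, 3 稍蜷, 0 硬挺
root = laplace_conditional(["蜷缩"] * 5 + ["稍蜷"] * 3, ["蜷缩", "稍蜷", "硬挺"])
print(root)   # 硬挺 never occurs in this class but still gets 1/11 instead of 0
```

Without smoothing, the unseen value 硬挺 would get probability 0 and wipe out the whole product for that class.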

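The corresponding Watermelon Book dataset (column names and values are kept in Chinese because the code below reads them as-is):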

色泽  根蒂  敲声  纹理  脐部  触感  好瓜
青绿 蜷缩 浊响 清晰 凹陷 硬滑 是
乌黑 蜷缩 沉闷 清晰 凹陷 硬滑 是
乌黑 蜷缩 浊响 清晰 凹陷 硬滑 是
青绿 蜷缩 沉闷 清晰 凹陷 硬滑 是
浅白 蜷缩 浊响 清晰 凹陷 硬滑 是
青绿 稍蜷 浊响 清晰 稍凹 软粘 是
乌黑 稍蜷 浊响 稍糊 稍凹 软粘 是
乌黑 稍蜷 浊响 清晰 稍凹 硬滑 是
乌黑 稍蜷 沉闷 稍糊 稍凹 硬滑 否
青绿 硬挺 清脆 清晰 平坦 软粘 否
浅白 硬挺 清脆 模糊 平坦 硬滑 否
浅白 蜷缩 浊响 模糊 平坦 软粘 否
青绿 稍蜷 浊响 稍糊 凹陷 硬滑 否
浅白 稍蜷 沉闷 稍糊 凹陷 硬滑 否
乌黑 稍蜷 浊响 清晰 稍凹 软粘 否
浅白 蜷缩 浊响 模糊 平坦 硬滑 否
青绿 蜷缩 沉闷 稍糊 稍凹 硬滑 否

Python implementation

# encoding: utf-8

import json

import numpy as np
import pandas as pd


class NaiveBayes:
    def __init__(self):
        # key: class label; value: {'PClass': smoothed prior,
        # 'PFeature': {feature: {value: smoothed P(value | class)}}}
        self.model = {}

    def calEntropy(self, y):
        # Entropy of a label series (not used by fit/predict)
        valRate = y.value_counts().apply(lambda x: x / y.size)  # relative frequency of each value
        valEntropy = np.inner(valRate, np.log2(valRate)) * -1
        return valEntropy

    def fit(self, xTrain, yTrain=None):
        # If yTrain is omitted, the last column of xTrain is used as the class label
        if yTrain is not None and not yTrain.empty:
            xTrain = pd.concat([xTrain, yTrain], axis=1)
        self.model = self.buildNaiveBayes(xTrain)
        return self.model

    def buildNaiveBayes(self, xTrain):
        yTrain = xTrain.iloc[:, -1]

        yTrainCounts = yTrain.value_counts()  # class frequencies
        nClasses = yTrainCounts.size
        # Laplace smoothing: (|Dc| + 1) / (|D| + N)
        yTrainCounts = yTrainCounts.apply(lambda x: (x + 1) / (yTrain.size + nClasses))
        retModel = {}
        for nameClass, val in yTrainCounts.items():
            retModel[nameClass] = {'PClass': val, 'PFeature': {}}

        propNamesAll = xTrain.columns[:-1]
        allPropByFeature = {}  # every value each feature takes in the whole training set
        for nameFeature in propNamesAll:
            allPropByFeature[nameFeature] = list(xTrain[nameFeature].value_counts().index)

        for nameClass, group in xTrain.groupby(xTrain.columns[-1]):
            for nameFeature in propNamesAll:
                eachClassPFeature = {}
                propDatas = group[nameFeature]
                propClassSummary = propDatas.value_counts()  # value frequencies within this class
                for propName in allPropByFeature[nameFeature]:
                    if not propClassSummary.get(propName):
                        propClassSummary[propName] = 0  # value never seen in this class: count 0
                Ni = len(allPropByFeature[nameFeature])
                # Laplace smoothing: (|Dc,xi| + 1) / (|Dc| + Ni)
                propClassSummary = propClassSummary.apply(lambda x: (x + 1) / (propDatas.size + Ni))
                for nameFeatureProp, valP in propClassSummary.items():
                    eachClassPFeature[nameFeatureProp] = valP
                retModel[nameClass]['PFeature'][nameFeature] = eachClassPFeature

        return retModel

    def predictBySeries(self, data):
        curMaxRate = None
        curClassSelect = None
        for nameClass, infoModel in self.model.items():
            rate = np.log(infoModel['PClass'])
            PFeature = infoModel['PFeature']

            for nameFeature, val in data.items():
                propsRate = PFeature.get(nameFeature)
                if not propsRate:
                    continue  # skip columns that are not features, e.g. the label column
                # Sum of logs instead of a product of small numbers that approaches zero.
                # A value never seen in training gives log(0) = -inf.
                rate += np.log(propsRate.get(val, 0))
            if curMaxRate is None or rate > curMaxRate:
                curMaxRate = rate
                curClassSelect = nameClass

        return curClassSelect

    def predict(self, data):
        if isinstance(data, pd.Series):
            return self.predictBySeries(data)
        return data.apply(lambda d: self.predictBySeries(d), axis=1)


dataTrain = pd.read_csv("xiguadata.csv", encoding="gbk")

naiveBayes = NaiveBayes()
treeData = naiveBayes.fit(dataTrain)

print(json.dumps(treeData, ensure_ascii=False))

# Column names: 预测值 = predicted label, 正取值 = actual label.
# The result frame is no longer named pd, which shadowed the pandas module.
result = pd.DataFrame({'预测值': naiveBayes.predict(dataTrain), '正取值': dataTrain.iloc[:, -1]})
print(result)
# 正确率 = accuracy
print('正确率:%f%%' % (result[result['预测值'] == result['正取值']].shape[0] * 100.0 / result.shape[0]))

Output

{"否": {"PClass": 0.5263157894736842, "PFeature": {"色泽": {"浅白": 0.4166666666666667, "青绿": 0.3333333333333333, "乌黑": 0.25}, "根蒂": {"稍蜷": 0.4166666666666667, "蜷缩": 0.3333333333333333, "硬挺": 0.25}, "敲声": {"浊响": 0.4166666666666667, "沉闷": 0.3333333333333333, "清脆": 0.25}, "纹理": {"稍糊": 0.4166666666666667, "模糊": 0.3333333333333333, "清晰": 0.25}, "脐部": {"平坦": 0.4166666666666667, "稍凹": 0.3333333333333333, "凹陷": 0.25}, "触感": {"硬滑": 0.6363636363636364, "软粘": 0.36363636363636365}}}, "是": {"PClass": 0.47368421052631576, "PFeature": {"色泽": {"乌黑": 0.45454545454545453, "青绿": 0.36363636363636365, "浅白": 0.18181818181818182}, "根蒂": {"蜷缩": 0.5454545454545454, "稍蜷": 0.36363636363636365, "硬挺": 0.09090909090909091}, "敲声": {"浊响": 0.6363636363636364, "沉闷": 0.2727272727272727, "清脆": 0.09090909090909091}, "纹理": {"清晰": 0.7272727272727273, "稍糊": 0.18181818181818182, "模糊": 0.09090909090909091}, "脐部": {"凹陷": 0.5454545454545454, "稍凹": 0.36363636363636365, "平坦": 0.09090909090909091}, "触感": {"硬滑": 0.7, "软粘": 0.3}}}}
预测值 正取值
0 是 是
1 是 是
2 是 是
3 是 是
4 是 是
5 是 是
6 否 是
7 是 是
8 否 否
9 否 否
10 否 否
11 否 否
12 是 否
13 否 否
14 是 否
15 否 否
16 否 否
正确率:82.352941%
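To show how one prediction comes out of the printed model, here is a small worked example for the first sample (青绿, 蜷缩, 浊响, 清晰, 凹陷, 硬滑). The probabilities are copied from the JSON output above; the variable names are illustrative only.

```python
import math

# Smoothed probabilities from the printed model, in the order:
# prior, 色泽=青绿, 根蒂=蜷缩, 敲声=浊响, 纹理=清晰, 脐部=凹陷, 触感=硬滑
model = {
    "是": [0.47368421052631576, 0.36363636363636365, 0.5454545454545454,
           0.6363636363636364, 0.7272727272727273, 0.5454545454545454, 0.7],
    "否": [0.5263157894736842, 0.3333333333333333, 0.3333333333333333,
           0.4166666666666667, 0.25, 0.25, 0.6363636363636364],
}

# Log-score of each class: log prior plus the sum of log conditionals
scores = {label: sum(math.log(p) for p in ps) for label, ps in model.items()}
best = max(scores, key=scores.get)

print(scores)  # roughly -4.10 for 是 and -6.94 for 否
print(best)    # 是
```

The higher log-score belongs to 是, which agrees with row 0 of the prediction table.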

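Summary:

  • A Bayes classifier is a generative model: rather than fitting the classification result directly, it fits the terms of the posterior probability formula and uses them to compute the probability of each class.
  • This article only covers binary classification, but the method also handles multi-class problems.
  • It performs well on small datasets.
  • It rests on the assumption that the features are mutually independent.
  • My GitHub page is https://github.com/fanchy, where I share some interesting things.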

IT技术精华 2019-08-23 23:08:07
14624 Decision Tree ID3: Principles and R/Python Implementation (Watermelon Book)

Decision Tree ID3: Principles and R/Python Implementation (Watermelon Book)

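Abstract:

A decision tree is a very common method for classification and regression in machine learning and can be viewed as a set of if-else rules. A classification decision tree is a tree made of nodes and directed edges: internal nodes represent features (attributes), edges represent attribute values, and the leaf an edge leads to holds the class. To classify a sample, start at the root, follow the branch that matches the sample's attribute value, and recurse until a leaf is reached; the sample is assigned the leaf's class.
Learning a decision tree means constructing a tree that classifies the training set correctly (or with minimal error) and generalizes well. The core question is which feature to use at each node.
Heuristic algorithms such as ID3, C4.5, and CART choose the best feature recursively: pick the best feature, split the dataset into subsets by that feature, then pick the best feature for each subset,
until all training samples are classified correctly. Decision trees have the following characteristics:

  1. Simple principle, efficient computation: the best feature is chosen using information-entropy theory, which is clear and cheap to compute.
  2. Highly interpretable: the tree structure and if-else logic match how people make decisions. Once a tree is built from the training set, visualizing it makes its decision logic easy to read.
  3. Good results, widely used: it usually fits well and classifies quickly, but it also overfits easily.

This article mainly covers building a decision tree with ID3.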

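How decision trees work

With several features in the training set, how is the best one chosen at each step? Information gain offers a criterion that also matches everyday intuition: the feature with the largest information gain is the best feature. In information theory, entropy measures the uncertainty of a random variable; the larger the entropy, the greater the uncertainty. Entropy is defined as:

H(X) = -\sum_{i} p_i \log p_i

The log is usually base 2. If a product line has a 100% pass rate and a 0% defect rate, the entropy is 0; if both rates are 50%, the entropy is 1. Larger entropy means higher uncertainty, which matches intuition. Let the class label be the random variable Y, so H(Y) is the uncertainty of Y. We try each candidate feature X in turn; the feature whose knowledge reduces the entropy of Y the most is chosen as the best feature. Information gain is defined as:

g(Y, X) = H(Y) - H(Y \mid X), \qquad H(Y \mid X) = \sum_{v} \frac{|D_v|}{|D|}\, H(Y \mid X = v)

where D_v is the subset of the dataset D in which X takes the value v.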
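The entropy examples in the paragraph above, and the information gain of one feature, can be checked with a short sketch. The functions `entropy` and `info_gain` are illustrative helpers that work on class counts and are not the article's code; the counts for 纹理 are tallied from the watermelon dataset.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts; zero counts contribute 0."""
    total = sum(counts)
    return sum(c / total * math.log2(total / c) for c in counts if c > 0)

def info_gain(parent_counts, child_counts):
    """H(Y) minus the size-weighted entropy of each subset D_v."""
    total = sum(parent_counts)
    conditional = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - conditional

print(entropy([100, 0]))  # 0.0: 100% pass rate, no uncertainty
print(entropy([50, 50]))  # 1.0: 50/50 split

# 纹理 splits the 17 melons into [good, bad] counts:
# 清晰 -> [7, 2], 稍糊 -> [1, 4], 模糊 -> [0, 3]
texture = [[7, 2], [1, 4], [0, 3]]
gain = info_gain([8, 9], texture)
print(gain)  # about 0.381
```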

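The ID3 algorithm

ID3 builds the tree from information gain as follows:

  1. If the dataset contains only one class, return that class and mark the node as a leaf.
  2. Otherwise, among all features of the dataset, choose the one with the largest information gain as the node; each of its values becomes an edge of the node.
  3. Split the dataset by the values of the chosen feature into subsets, and build a subtree for each subset recursively, starting again from step 1.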

Python implementation

# encoding: utf-8

import json

import numpy as np
import pandas as pd


class DecisionTree:
    def __init__(self):
        self.model = None

    def calEntropy(self, y):
        # Entropy of the label series y
        valRate = y.value_counts() / y.size  # relative frequency of each class
        valEntropy = np.inner(valRate, np.log2(valRate)) * -1
        return valEntropy

    def fit(self, xTrain, yTrain=None):
        # If yTrain is omitted, the last column of xTrain is used as the class label
        if yTrain is None or yTrain.size == 0:
            yTrain = xTrain.iloc[:, -1]
            xTrain = xTrain.iloc[:, :-1]
        self.model = self.buildDecisionTree(xTrain, yTrain)
        return self.model

    def buildDecisionTree(self, xTrain, yTrain):
        propNamesAll = xTrain.columns
        yTrainCounts = yTrain.value_counts()  # sorted by frequency, most common first
        if yTrainCounts.size == 1:
            return yTrainCounts.index[0]  # only one class left: leaf node
        if len(propNamesAll) == 0:
            return yTrainCounts.index[0]  # no features left: fall back to the majority class
        entropyD = self.calEntropy(yTrain)

        maxGain = None
        maxEntropyPropName = None
        for propName in propNamesAll:
            propDatas = xTrain[propName]
            propClassSummary = propDatas.value_counts() / propDatas.size  # |Dv| / |D| per value

            sumEntropyByProp = 0
            for propClass, dvRate in propClassSummary.items():
                yDataByPropClass = yTrain[xTrain[propName] == propClass]
                entropyDv = self.calEntropy(yDataByPropClass)
                sumEntropyByProp += entropyDv * dvRate
            gainEach = entropyD - sumEntropyByProp
            if maxGain is None or gainEach > maxGain:
                maxGain = gainEach
                maxEntropyPropName = propName

        retClassByProp = {}
        for propClass in xTrain[maxEntropyPropName].value_counts().index:
            whichIndex = xTrain[maxEntropyPropName] == propClass
            # Drop the chosen feature column (on a copy, not on a slice of xTrain)
            xDataByPropClass = xTrain[whichIndex].drop(columns=[maxEntropyPropName])
            yDataByPropClass = yTrain[whichIndex]
            retClassByProp[propClass] = self.buildDecisionTree(xDataByPropClass, yDataByPropClass)

        return {'Node': maxEntropyPropName, 'Edge': retClassByProp}

    def predictBySeries(self, modelNode, data):
        if not isinstance(modelNode, dict):
            return modelNode  # leaf: the class label
        nodePropName = modelNode['Node']
        prpVal = data.get(nodePropName)
        for edge, nextNode in modelNode['Edge'].items():
            if prpVal == edge:
                return self.predictBySeries(nextNode, data)
        return None  # value not seen on this branch during training

    def predict(self, data):
        if isinstance(data, pd.Series):
            return self.predictBySeries(self.model, data)
        return data.apply(lambda d: self.predictBySeries(self.model, d), axis=1)


dataTrain = pd.read_csv("xiguadata.csv", encoding="gbk")

decisionTree = DecisionTree()
treeData = decisionTree.fit(dataTrain)
# Column names: 预测值 = predicted label, 正取值 = actual label
print(pd.DataFrame({'预测值': decisionTree.predict(dataTrain), '正取值': dataTrain.iloc[:, -1]}))

print(json.dumps(treeData, ensure_ascii=False))

After training, the model is stored as a recursive (nested) dictionary; formatted with a JSON tool, the structure of the tree is easy to read.
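As a sketch of that nested-dictionary layout, the snippet below uses a small hand-written tree in the same {'Node': ..., 'Edge': {...}} shape. The tree is hypothetical, not the one learned from the dataset, and `walk` is an illustrative iterative counterpart to the recursive predictBySeries.

```python
import json

# Hypothetical two-level tree: internal nodes are dicts, leaves are plain labels
tree = {
    "Node": "纹理",
    "Edge": {
        "清晰": {"Node": "根蒂", "Edge": {"蜷缩": "是", "硬挺": "否"}},
        "模糊": "否",
    },
}

def walk(node, sample):
    """Follow matching edges until a leaf is reached; None if no edge matches."""
    while isinstance(node, dict):
        node = node["Edge"].get(sample.get(node["Node"]))
    return node

print(json.dumps(tree, ensure_ascii=False, indent=2))
print(walk(tree, {"纹理": "清晰", "根蒂": "蜷缩"}))  # 是
print(walk(tree, {"纹理": "稍糊"}))                  # None: no edge for 稍糊
```

Passing `indent=2` to `json.dumps` gives the formatted view directly, without an external JSON tool.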

R implementation



dataTrain <- read.csv("xiguadata.csv", header = TRUE)

trainDecisionTree <- function(dataTrain){
  calEntropy <- function(y){ # compute entropy
    values <- table(unlist(y)); # frequency table of the labels
    valuesRate <- values / sum(values);
    logVal = log2(valuesRate);# log2(0) == -Inf
    logVal[is.infinite(logVal)]=0;
    valuesEntropy <- -1 * t(valuesRate) %*% logVal;
    if (is.nan(valuesEntropy)){
      valuesEntropy = 0;
    }
    return(valuesEntropy);
  }

  propNamesAll <- names(dataTrain)
  propNamesAll <- propNamesAll[length(propNamesAll) * - 1]
  print(propNamesAll)
  buildDecisionTree <- function(propNames, dataSet){
    classColumn = dataSet[, length(dataSet)]# the last column is the class label
    classSummary <- table(unlist(classColumn))# frequency table

    defaultRet = c(propNames[1], names(classSummary)[which.max(classSummary)]);
    if (length(classSummary) == 1){# all samples share one class: mark as a leaf
      return(defaultRet);
    }
    if (length(propNames) == 1){# only one attribute left: return the majority class as the node
      return(defaultRet);
    }
    entropyD <- calEntropy(classColumn)
    propGains = sapply(propNames, function(propName){ # propName is one of "色泽" "根蒂" "敲声" "纹理" "脐部" "触感"
      propDatas <- dataSet[c(propName)]
      propClassSummary <- table(unlist(propDatas))# frequency table

      retGain <- sapply(names(propClassSummary), function(propClass){# propClass is a value of the attribute, e.g. 浅白 青绿 乌黑 for 色泽
        dataByPropClass <- subset(dataSet, dataSet[c(propName)] == propClass); # rows where the attribute equals propClass
        entropyDv <- calEntropy(dataByPropClass[, length(dataByPropClass)]) # the last column marks whether the melon is good
        Dv = propClassSummary[c(propClass)][1]
        return(entropyDv * Dv);# not divided by |D| here; dividing after summing is equivalent
      });

      return(entropyD - sum(retGain)/sum(propClassSummary));
    });
    #print(propGains);
    maxEntropyProp = propGains[which.max(propGains)];# attribute with the largest information gain
    propName = names(maxEntropyProp)[1]
    #print(propName)
    propDatas <- dataSet[c(propName)]
    propClassSummary <- table(unlist(propDatas))# frequency table
    propClassSummary <- propClassSummary[which(propClassSummary > 0)]
    propClassNames <- names(propClassSummary)

    #propClassNames = c(propClassNames[1])
    retGain <- sapply(propClassNames, function(propClass){# propClass is a value of the chosen attribute
      dataByPropClass <- subset(dataSet, dataSet[c(propName)] == propClass); # rows where the attribute equals propClass
      leftClassNames = propNames[which(propNames==propName) * -1] # remove this attribute and build the subtree recursively
      ret = buildDecisionTree(leftClassNames, dataByPropClass);
      return(ret);
    });
    #names(retGain) = propClassNames
    retList = retGain
    #retList = list()
    #for (propClass in propClassNames){
    # retList[propClass] = retGain[propClass]
    #}
    #print(retList)

    # element 1 is the chosen attribute name; element 2 holds the result per attribute value:
    # a subtree if there is one, otherwise a class label
    ret = list(propName, retList)
    #ret = data.frame(c(retList))
    #names(ret) = c(propName)
    return(ret);
  }
  retProp = buildDecisionTree(propNamesAll, dataTrain);
  return(retProp);
}
decisionTree = trainDecisionTree(dataTrain)
#print(decisionTree)


library("rpart")
library("rpart.plot")
dataTrain <- read.csv("xiguadata.csv", header = TRUE)
print(dataTrain)
fit <- rpart(HaoGua~.,data=dataTrain,control = rpart.control(minsplit = 1, minbucket = 1),method="class")
printcp(fit)

rpart.plot(fit, branch = 1, branch.type = 1, type = 2, extra = 102,shadow.col='gray', box.col='green',border.col='blue', split.col='red',main="DecisionTree")

#library(jsonlite)
#dataJson = toJSON(decisionTree)
#c <- file( "result.txt", "w" )
#writeLines(dataJson, c )
#close( c ) # the file must be closed explicitly here

#for (k in propNames) {
# eachData <- dataSet[c(k)]
# values <- table(unlist(eachData))# frequency table
# #print(values)
# print(k)
# total <- 0
# for (m in names(values)) {
# #print(m)
# #print(values[m][1])
# data3 <- subset(dataSet, dataSet[c(k)] == m)
# entropyDv <- calEntropy(data3[, length(data3)])
# #print(entropyDv)
# total = total + entropyDv*values[c(m)][1]
# }
# GainDv <- entropyD - total / sum(values);
# print(GainDv)
#}

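The R code contains my own ID3 implementation in R; at the end it also trains a decision tree with R's rpart package.

Summary:

  • ID3 is concise and clear, and matches the way people think.
  • Decision trees are highly interpretable; once visualized, the model is easy to understand and verify.
  • ID3 suits samples with categorical features; it cannot be run as-is on features with continuous numeric values.
  • There is a risk of overfitting, which pruning helps to avoid.
  • Information gain is sometimes biased toward features with many distinct values; C4.5 and CART improve on this.
  • My GitHub page is https://github.com/fanchy, where I share some interesting things.
  • Python is much smoother to write than R here, mainly for iteration and nesting; R is a bit more convenient for filtering and selecting data. This Python version of ID3 is clear and concise: Talk is cheap. Show me the code. It is among the most compact from-scratch versions I have seen online.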

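The corresponding Watermelon Book dataset (the label column is named HaoGua, as used in the rpart formula above):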

色泽  根蒂  敲声  纹理  脐部  触感  HaoGua
青绿 蜷缩 浊响 清晰 凹陷 硬滑 是
乌黑 蜷缩 沉闷 清晰 凹陷 硬滑 是
乌黑 蜷缩 浊响 清晰 凹陷 硬滑 是
青绿 蜷缩 沉闷 清晰 凹陷 硬滑 是
浅白 蜷缩 浊响 清晰 凹陷 硬滑 是
青绿 稍蜷 浊响 清晰 稍凹 软粘 是
乌黑 稍蜷 浊响 稍糊 稍凹 软粘 是
乌黑 稍蜷 浊响 清晰 稍凹 硬滑 是
乌黑 稍蜷 沉闷 稍糊 稍凹 硬滑 否
青绿 硬挺 清脆 清晰 平坦 软粘 否
浅白 硬挺 清脆 模糊 平坦 硬滑 否
浅白 蜷缩 浊响 模糊 平坦 软粘 否
青绿 稍蜷 浊响 稍糊 凹陷 硬滑 否
浅白 稍蜷 沉闷 稍糊 凹陷 硬滑 否
乌黑 稍蜷 浊响 清晰 稍凹 软粘 否
浅白 蜷缩 浊响 模糊 平坦 硬滑 否
青绿 蜷缩 沉闷 稍糊 稍凹 硬滑 否

IT技术精华 2019-08-19 23:08:07
14623 FEX Tech Weekly - 2019/08/12

Deep Reads

React v16.9.0 and the Roadmap Update
https://reactjs.org/blog/2019/08/08/react-v16.9.0.html
It contains several new features, bugfixes, and new deprecation warnings to help prepare for a future major release. New Deprecations: Renaming Unsafe Lifecycle Methods; Deprecating javascript: URLs. New Features: Async act() for Testing; Performance Measurements with <React.Profiler>;

Technical vision for Qt 6
https://blog.qt.io/blog/2019/08/07/technical-vision-qt-6/
Below are some of the key changes we need to make in Qt to make it fit for the next years to come: Next-generation QML, Next-generation graphics, Unified and consistent tooling.

Intl.NumberFormat
https://v8.dev/features/intl-numberformat
You might already be familiar with the Intl.NumberFormat API, as it’s been supported across modern environments for a while now. In its most basic form, Intl.NumberFormat lets you create a reusable formatter instance that supports locale-aware number formatting.

Apollo Client, now with React Hooks
https://blog.apollographql.com/apollo-client-now-with-react-hooks-676d116eeae2
A new, slimmer API designed for modern React. Apollo Client now includes three hooks that you can drop into your app, anywhere you currently use a corresponding higher-order component or render prop component: useQuery, useMutation, and useSubscription. These hooks are simple to get started with, and they have many advantages over the previous APIs, including bundle size reductions and less boilerplate code.

5 Tips to Help You Avoid React Hooks Pitfalls
https://kentcdodds.com/blog/react-hooks-pitfalls
Let’s look a bit at what pitfalls you could come across and how you can change your thinking so you avoid them.

  • Pitfall 1: Starting without a good foundation;
  • Pitfall 2: Not using (or ignoring) the ESLint plugin;
  • Pitfall 3: Thinking in Lifecycles;
  • Pitfall 4: Overthinking performance;
  • Pitfall 5: Overthinking the testing of hooks.

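Serverless For Frontend: Past and Present
https://www.yuque.com/egg/nodejs/sff-history
As a frontend developer you may have long wondered what Node.js is for and why we need it. Especially now in 2019 and for some time to come, there is one word, Serverless, that you will hear until you are sick of it. Everyone talks about Serverless, almost no one knows how to put it into production, yet everyone believes others are investing heavily in it, so everyone advertises that they are doing Serverless. Alibaba, as a leader of Node.js in China, has practiced deeply in this area for years. It was the first in China to introduce the BFF concept and is now the first to propose SFF (Serverless For Frontend). The author took part in this evolution over the past few years and shares some takeaways here as a starting point for discussion.

[Translation] Breaking free of JS frameworks: lessons from 5 years of web component development
https://mp.weixin.qq.com/s?__biz=MzUxMzcxMzE5Ng==&mid=2247492121&idx=1&sn=866af57bc412fa277e28721e0d486168
Web components let developers create reusable components with HTML, CSS, and JavaScript, which means components can be built without a framework. The author shares nearly five years of experience building with web components only, with zero frameworks.

Douyin engineering practice: a binary-reordering solution that improves app launch speed by more than 15%
https://mp.weixin.qq.com/s/Drmmx5JtjG3UtTFksL6Q8Q
Launch is the first impression an app gives its users and is critical to user experience. Douyin iterates quickly, and left unchecked, launch speed degrades bit by bit. The Douyin iOS client team has therefore done a great deal of optimization. Beyond the traditional approach of changing business code, we also explored new ground and found that changing how code is laid out in the binary can improve launch performance; once deployed in Douyin, launch speed improved by about 15%. Starting from the underlying principle, this article describes how we used static scanning and runtime tracing to find the functions called at launch, then changed compiler flags to re-lay out the binary.

[Translation] WebAssembly: the future of JS and multi-language web development
https://juejin.im/post/5d4b17b0f265da03c926e436
A document by the simviso team translating a JSConf.Asia talk on WebAssembly. It is not a literal translation; parts of it reflect the translators' own thinking. The speaker is Kas Perch, a developer at Cloudflare. Now let's find out what WebAssembly is. See also: "2019 JSConf.Asia, 尤雨溪 (Evan You): Seeking Balance in Framework Design"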

Native lazy-loading for the web
https://web.dev/native-lazy-loading
Browser-level native lazy-loading is finally here! As of Chrome 76 (available now), you can use the loading attribute to natively lazy load resources, without the need for custom code or a separate JS library. This post dives into the details.

Writing Modes And CSS Layout
https://www.smashingmagazine.com/2019/08/writing-modes-layout/
An understanding of CSS Writing Modes is useful if you want to work with vertical scripts, or change writing mode for creative reasons. However, they also underpin our new layout methods, and those ideas are increasingly being applied across all of CSS. In this article, find out why Rachel Andrew believes understanding writing modes is so important.

Design Principles for Developers: Processes and CSS Tips for Better Web Design
https://css-tricks.com/design-principles-for-developers-processes-and-css-tips-for-better-web-design/
Whenever you use HTML and CSS, you are designing—giving form and structure to content so it can be understood by someone else. People have been designing for centuries and have developed principles along the way that are applicable to digital interfaces today. These principles manifest in three key areas: how words are displayed (typography), how content is arranged (spacing), and how personality is added (color). Let’s discover how to use each of these web design ingredients through the mindset of a developer with CSS properties and guidelines to take the guesswork out of web design.

Introducing Storybook Design System
https://medium.com/storybookjs/introducing-storybook-design-system-23fd9b1ac3c0
A reusable UI component library for Storybook contributors

How To Get Started With IPFS and Node
https://medium.com/better-programming/how-to-get-started-with-ipfs-and-node-fa04baec6b3a
Learn what the InterPlanetary File System is and how to store data with it.

Build Your Own Video Chat with Vue, WebRTC, SocketIO, Node & Redis
https://levelup.gitconnected.com/build-your-own-video-chat-with-vue-webrtc-socketio-node-redis-eb51b78f9f55
Nowadays, there are plenty of free applications out there in the market providing chat and video conference functionality. In almost one click we all are able to communicate with anyone in any part of the world but, why don’t we try to build our own app to make it even more real? Let’s do it!

Regex For Noobs (like me!) - An Illustrated Guide
https://www.janmeppe.com/blog/regex-for-noobs/
This blog post is an illustrated guide to regex and aims to provide a gentle introduction for people who never have fiddled with regex, want to, but are kind of intimidated by the whole thing.

Welcome to Serverless 2.0
https://www.youtube.com/watch?v=JvXm-oHi5Mg
This talk from the creator of OpenFaaS and CNCF Serverless workgroup member Alex Ellis explores what hype actually means for our industry and why it matters so much for serverless right now. You’ll learn why ThoughtWorks says multi-cloud portability is best achieved through containers and learn more about how the industry is shifting towards Kubernetes for container management. This has led to what Alex is coining Serverless 2.0 - a truly portable experience built on battle-tested technology.

Stack Overflow: How We Do App Caching - 2019 Edition
https://nickcraver.com/blog/2019/08/06/stack-overflow-how-we-do-app-caching/
So…caching. What is it? It’s a way to get a quick payoff by not re-calculating or fetching things over and over, resulting in performance and cost wins. That’s even where the name comes from, it’s a short form of the “ca-ching!” cash register sound from the dark ages of 2014 when physical currency was still a thing, before Apple Pay. I’m a dad now, deal with it. Let’s say we need to call an API or query a database server or just take a bajillion numbers (Google says that’s an actual word, I checked) and add them up. Those are all relatively crazy expensive. So we cache the result – we keep it handy for re-use.

Unifying visual embeddings for visual search at Pinterest
https://medium.com/pinterest-engineering/unifying-visual-embeddings-for-visual-search-at-pinterest-74ea7ea103f0
In 2017, we launched Lens camera search, opening up Pinterest’s visual search to the real world by turning every Pinner’s phone camera into a powerful discovery system. And in 2019, we’ve launched Automated Shop the Look so Pinners can find and shop for exact products within a Pinterest home-decor scene. Visual search is one of the fastest-growing products at Pinterest with hundreds of millions of searches per month.

OpenCensus Web: Unlocking Full End-to-End Observability for Your Entire Stack
https://opensource.googleblog.com/2019/08/opencensus-web-unlocking-full-end-to.html
OpenCensus Web is a tool to trace and monitor the user-perceived performance of your web pages. It can help determine whether or not your web pages are experiencing performance issues that you might otherwise not know how to diagnose.

Life After Hadoop: Getting Data Science to Work for Your Business
https://towardsdatascience.com/life-after-hadoop-getting-data-science-to-work-for-your-business-c9ab6605733f
Traditional approaches like Apache Hadoop and CPU-based infrastructure aren’t up to the task — they pose too many bottlenecks and are too slow. NVIDIA and its partners are building GPU-based tools that let businesses unleash data science using AI. Below are five key elements to building AI-optimized data centers, but first let’s take a look at the crossroads data science is at.

Build Your Own Text Editor
https://viewsourcecode.org/snaptoken/kilo/index.html
Welcome! This is an instruction booklet that shows you how to build a text editor in C. The text editor is antirez’s kilo, with some changes. I explain each step along the way, sometimes in a lot of detail. Feel free to skim or skip the prose, as the main point of this is that you are going to build a text editor from scratch! Anything you learn along the way is bonus, and there’s plenty to learn just from typing in the changes to the code and observing the results.

What Every Developer Should Learn Early On
https://stackoverflow.blog/2019/08/07/what-every-developer-should-learn-early-on/
Languages aren’t necessarily “Good” or “Bad”; Reading Other People’s Code is Hard; You’ll Never Write “Perfect” Code; Working as a Programmer Doesn’t Mean 8 Hours of Programming a Day.

How to Make Your Open Source Project Successful
https://dmitripavlutin.com/how-to-make-your-open-source-project-successful/
I’ve built an open source library vocajs.com that managed to rise to the top trending repositories on GitHub. Along the way, I learned some important principles of how to make a quality open source project. I want to share these ideas:

    1. No one cares about your project
    2. Solve a real problem
    3. Put an accent on quality
    4. Excellent README.md and documentation
    5. Showcase with demos and screenshots
    6. Try building a community
    7. Let the world know

All the best engineering advice I stole from non-technical people
https://medium.com/@bellmar/all-the-best-engineering-advice-i-stole-from-non-technical-people-eb7f90ca2f5f
More often than not the best advice, the things that stuck with me, came from people who had no background at all in software. It’s intriguing that the stuff that really seems to make a difference in the quality of software never seems to be about software. These are five of my favorites: 1. “People like us make our money in the seams of things”; 2. “Know what people are asking you to be an expert in”; 3. “Before you can make things better, you have to stop making them worse”; 4. “To go left, turn right”; 5. “Thinking is also work”.

Fresh Picks

DC_OS - The Definitive Platform for Modern Apps
https://dcos.io/
DC/OS (the Distributed Cloud Operating System) is an open-source, distributed operating system based on the Apache Mesos distributed systems kernel. DC/OS manages multiple machines in the cloud or on-premises from a single interface; deploys containers, distributed services, and legacy applications into those machines; and provides networking, service discovery and resource management to keep the services running and communicating with each other.

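Registration for the 2019 Google Developer Conference in China
https://mp.weixin.qq.com/s?__biz=MzAwODY4OTk2Mg==&mid=2652050101&idx=1&sn=c4fc3a6e7780daf4dbbc49b01d7c5204
The 2019 Google developer conference will be held in Shanghai on September 10 and 11; those interested are welcome to attend.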

GitHub Actions now supports CI/CD, free for public repositories
https://github.blog/2019-08-08-github-actions-now-supports-ci-cd/
Since we introduced GitHub Actions last year, the response has been phenomenal, and developers have created thousands of inspired workflows. But we’ve also heard clear feedback from almost everyone: you want CI/CD! And that’s what we’re announcing today. We hope you’ll try out the beta before GitHub Actions is generally available on November 13. We can’t wait to hear what you think!

Chrome 77 Beta
https://blog.chromium.org/2019/08/chrome-77-beta-new-performance-metrics.html
New performance metrics: Largest Contentful Paint, new form capabilities, capabilities in origin trials and more…

ES proposal: globalThis
https://2ality.com/2019/08/global-this.html
The ECMAScript proposal “globalThis” by Jordan Harband provides a new standard way of accessing the global object.

Data Structures and Algorithms in JavaScript
https://github.com/amejiarosario/dsa.js-data-structures-algorithms-javascript
This is the coding implementations of the DSA.js book and the repo for the NPM package. In this repository, you can find the implementation of algorithms and data structures in JavaScript. This material can be used as a reference manual for developers, or you can refresh specific topics before an interview. Also, you can find ideas to solve problems more efficiently.

AnyChart 8.7.0 with New Awesome Features!
https://www.anychart.com/blog/2019/08/08/javascript-graph-visualization-libraries-new-features-anychart/
Client-Side Export, Stock UI Controls: Swap and Maximize, Stock Plot Splitter, Infinite Range Annotations…

Lightning Web Components
https://lwc.dev/
A Blazing Fast, Enterprise-Grade Web Components Foundation. Lightning web components are custom elements built using HTML and modern JavaScript. The Lightning Web Components UI framework uses core Web Components standards and provides only what’s necessary to perform well in browsers. Because it’s built on code that runs natively in browsers, the framework is lightweight and delivers exceptional performance. Most of the code you write is standard JavaScript and HTML.

react-archer
https://github.com/pierpo/react-archer
Draw arrows between DOM elements in React.

A History of Amazon Web Services (AWS)
https://www.awsgeek.com/pages/AWS-History/
A history of AWS service announcements (either pre-announced or in some form of limited preview, like beta) and releases (generally available in one or more AWS regions).

MC.JS
https://github.com/ian13456/mc.js
Open source Minecraft clone built with ThreeJS, ReactJS, GraphQL, and NodeJS. MC.JS brings the best-selling PC game “Minecraft” into the web with the power of javascript.

Photoronoi
https://wattenberger.com/photoronoi
Upload an image or grab from a url to turn an image into a voronoi SVG. Darker parts of the image result in smaller polygons - play around with the settings to see what works best for your picture! You might try removing the background of your image using remove.bg.

Silk
http://weavesilk.com/
Interactive generative art.

Pkg Stats
https://www.pkgstats.com/
npm package discovery and stats viewer.

Rust Language Cheat Sheet
https://cheats.rs/
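Rust is a programming language rich in design philosophy. By understanding the design philosophy Rust follows, and then its syntax structure and programming ideas, you can grasp the core of the language systematically rather than getting lost in its intricate syntax details. See also: Rust, the language that wants to change the world

Design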

To Make Apps Accessible, Make Them Compatible with Different Devices
https://medium.com/google-design/to-make-apps-accessible-make-them-compatible-with-different-devices-11298c6d3f06
How to design with old operating systems, varying contrast, low battery life, and damaged screens in mind.

The Complete Guide to UX Research Methods
https://www.toptal.com/designers/user-research/guide-to-ux-research-methods
User experience (UX) design is the process of designing products that are useful, easy to use, and a pleasure to engage. It’s about enhancing the entire experience people have while interacting with a product and making sure they find value, satisfaction, and delight. If a mountain peak represents that goal, employing various UX research methods are the path UX designers use to get to the top of the mountain.

Gung Ho Predictions About the Future of AI: UX, Psychology, and Advertising
https://uxplanet.org/not-so-gung-ho-predictions-about-the-future-of-ai-ux-psychology-and-advertising-3c47f96c6ef4
As you surf the web for news and insight, AI is learning what you like to read, how long you like to read for, and it probably even knows if you’re one of those people that read articles with your mouse cursor. And with that information that AI learns, companies are turning a profit. AI software like Google Ads and Facebook Ads select advertisements to show you based upon your habits. And I believe these AI are only going to become more advanced.

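Product and Miscellaneous

After shutting down the anonymous social app Yiguan (一罐), its founder looks back: the tenfold goal is the root of all evil
https://new.qq.com/omn/20190806/20190806A0E6S800.html
Chunyin (纯银), founder of Yiguan, sums up the lessons of 7 years of entrepreneurship: 1) prototype testing is the source of everything; 2) the tenfold goal is the root of all evil; 3) strategy, tactics, and partners. "Without a big story you can neither raise money nor build a team; with a big story, being too aggressive sharply raises the odds of failure. Is there really no way out?" There is one. I coined the term "native startup family", meaning a founding team that meets four conditions: angel-round funding; a core team that fully covers product, engineering, and operations; a core team bonded by long-standing friendship; a core team founding a company for the first time.

Myths and realities about ByteDance
https://mp.weixin.qq.com/s?__biz=MzIxMDgyMTM0NQ==&mid=2247488047&idx=1&sn=42f0cd0865c325b0ff007b7367d028cd
ByteDance, an "enhanced edition" of Baidu: although the media love to hype the "Toutiao vs. Tencent war", ByteDance's closest counterpart is actually Baidu. It has the strengths of Baidu at its peak (leading technology, strong sales) while avoiding the weaknesses that led to Baidu's decline (neglect of user experience, poor execution), and so it has eaten as much of Baidu's cake as possible. ByteDance's rise is the story of Baidu's collapse, and the process is still going on; of course, Weibo and Kuaishou have also benefited from it.

Full 3-hour transcript of "New Species Explosion: Wu Sheng Business Methods Release 2019"
https://mp.weixin.qq.com/s?__biz=MzUyMDQ5NzI5Mg==&mid=2247510117&idx=2&sn=336326b443000dae1f63347cc78e9d16
Wu Sheng's 3-hour non-stop talk kept up its "predictions" and debuted "scene algorithms" to decode the new era of digital business. Wu Sheng (吴声), founder of Scene Lab (场景实验室), opened with a "real vs. fake Wu Sheng" dialogue, leading into a discussion of attitudes toward digital life and the rules of digital civilization. He argues that as the post-2000 digital natives, "born at the edge, self-organizing, openly collaborative", enter the scene, business rules are being rewritten, hence "digital business is young". Scene algorithms are the operating system of digital business, a systematic exploration of what young business could be.

Those interested in data visualization can read, from AntV, 墨者修齐 2019-08-12: human-centered orthogonal network layout, typhoon visualization with ggplot, and font relationship analysis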


IT技术精华 2019-08-19 16:08:12
14622 FEX Tech Weekly - 2019/08/19

Deep Reads

V8 release v7.7
https://v8.dev/blog/v8-release-77
Performance (size & speed): Lazy feedback allocation, Scalable WebAssembly background compilation, Stack trace improvements. JavaScript language features: The Intl.NumberFormat API.

Introducing the New React DevTools
https://reactjs.org/blog/2019/08/15/new-react-devtools.html
A lot has changed in version 4! At a high level, this new version should offer significant performance gains and an improved navigation experience. It also offers full support for React Hooks, including inspecting nested objects.

JavaScript and Node Testing Best Practices
https://github.com/goldbergyoni/javascript-testing-best-practices
Almost 50 best practices divided into categories (backend, frontend, CI, etc.) complete with code examples. Not just the basics, it digs into areas like visual regression, property-based testing, and contract testing, too.

useReducer + useContext for easy global state without libraries
https://swizec.com/blog/usereducer-usecontext-for-easy-global-state-without-libraries/swizec/9182
Ok so global state management, how do you do it? You grab Redux or MobX, or maybe Unstated or Constate and you’re done. Right? All those libraries give you a global store to put your data, allow every component to access that state, and manage re-renders. You get some way of updating that state and life is good. Sometimes there’s extra features. But what if you don’t want to learn yet another library? You just wanna share some state between a bunch of components. What then?

The history and legacy of jQuery
https://blog.logrocket.com/the-history-and-legacy-of-jquery/
jQuery may have fallen somewhat out of favor in web development, but it still powers an estimated 74 percent of sites and paved the way for modern web frameworks. (In recent polls we’ve done, many people are still actively choosing to use it too. Long live jQuery!)

[Translation] A Future Without Webpack
https://juejin.im/post/5d4bcdb7e51d453b386a62c6
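As a JavaScript developer in 2019, I feel the same way. We already have a brand-new JavaScript module system (ESM) that runs natively on the web, yet we still have to run a bundler for everything we build. Why? Over the past several years, JavaScript bundling has gone from a production-only optimization to a required build step even during development. Whether you like it or not, it is hard to deny that bundlers have added an enormous amount of complexity to web development, a field that has always prided itself on view-source and an easy-to-get-started ethos. @pika/web is an attempt to free web developers from bundler hell. In 2019, you should use a bundler because you want to, not because you need to.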

Reshaping the Front-End Workflow: Lugia Official Release
https://zhuanlan.zhihu.com/p/77255855
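More than a year after the first Lugia-related repository was created in May 2018, and after 220,000+ lines of code and 3,500+ commits, Lugia has finally reached its first official release, 1.1.0. Early this year we released the 1.0.0 technical preview internally to validate our complete "big front-end" solution; after more than half a year of further polishing, we decided to share the result with the community. Lugia is a complete solution for a cloud-native big front-end ecosystem. We want to integrate interaction design and front-end application development into an organic whole and form a new generation of big front-end technical specifications: Lugia Design; LugiaX; Lugia Web; and Lugia Mega, a standard, efficient, out-of-the-box visual design and development tool for the front end. Lugia Mega is a cross-platform desktop application (Mac and Windows) that needs no environment setup and is quick to pick up. The team defined LugiaD, a specification for a meta-information intermediate description language, to give developers visual interaction design, shield them from the underlying toolchain and front-end framework (React or Vue), and support a development style based on registering meta-information. It helps designers and product managers design products quickly, with results that developers can use directly. Lugia Mega spans the whole project life cycle, letting you build front-end applications quickly and manage all your projects with ease.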

云凤蝶 (Yunfengdie): Improving R&D Efficiency for Middle-Platform Applications
https://zhuanlan.zhihu.com/p/78425921
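We recently released the all-new 云凤蝶 2.0 inside Ant, shifting the product's focus completely from H5 page building to middle-platform (中台) applications. With 云凤蝶 you can quickly build high-quality middle-platform applications. We currently focus on the following three areas in serving middle-platform business:

  • Lower the barrier: let more people take part in building middle-platform applications
  • Improve efficiency: can we achieve a 10x improvement?
  • Improve the experience: design guidelines applied automatically, usable by default

This article mainly discusses how 云凤蝶 understands efficiency for middle-platform work and, from the perspective of the development model, our approach to achieving a 10x improvement. See also: My Year of Hands-On Middle-Platform Work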

What Are the Engineering Highlights of Visual Studio Code?
https://zhuanlan.zhihu.com/vs-code
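Visual Studio Code (VS Code) has seen explosive growth in recent years and has become an essential tool for many developers. As an open-source project it has also attracted countless third-party developers and end users, becoming one of the top open-source projects. It has enough features, is pleasant to use, and stays clean and smooth even with a huge number of extensions, which is rare. I am a VS Code user and also develop extensions for it; most of the Java extensions in the marketplace are the work of our team, so in my daily work I have noticed quite a few engineering highlights of VS Code, which I discuss one by one below.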

Front-End/Back-End Separation in Practice at the Scale of a Large Business Unit
https://segmentfault.com/a/1190000020047069
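As the number of front-end business lines and platforms in the department grows, the front end's responsibilities grow too, bringing all sorts of problems and challenges. The front-end team currently has 31 people responsible for 15+ businesses/projects and platforms; the total PV of the front-end projects is at least 20 million, and because these are tool-type applications, MAU (monthly active users) exceeds 100 million. Faced with a user base and business pressure of this size, the team has run into various problems in development and maintenance. First, infrastructure: without complete, unified standards and facilities, every project has a different tech stack and implementation approach, and little functionality is reused. Second, member growth: how to help this many people improve both their technical skills and their problem-solving ability. Third, the efficiency concerns that other roles care about more, including overall front-end development efficiency and the efficiency of collaboration with upstream and downstream teams. Finally, overall stability: errors and experience-related problems need to be discovered as early as possible, which is another important issue. See also: The Evolution Path of Technical Architecture and Organizational Structure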

It's Time: A CSS Development Strategy with No External Links
https://www.zhangxinxu.com/wordpress/2019/08/css-no-external-link/
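Think about what era we are in: my kid is already old enough to run errands, and some CSS development strategies need to change too. If your project does not need to support IE8, try adopting the following CSS development strategy: no external links in CSS code! In other words, no http/https requests should be issued from a CSS file.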

The (Upcoming) WordPress Renaissance
https://www.smashingmagazine.com/2019/08/upcoming-wordpress-renaissance/
Since its release 8 months ago, Gutenberg has been greatly improved, offering a user experience much richer than anything that was possible in WordPress. Let’s take a look at its latest developments, and where it is heading to.

The Differing Perspectives on CSS-in-JS
https://css-tricks.com/the-differing-perspectives-on-css-in-js/
Some people outright hate the idea of CSS-in-JS. Just that name is offensive. Hard no. Styling doesn’t belong in JavaScript, it belongs in CSS, a thing that already exists and that browsers are optimized to use. Separation of concerns. Anything else is a laughable misstep, a sign of not learning from the mistakes of the past (like the <font> tag and such.) Some people outright love the idea of CSS-in-JS. The co-location of templates and functionality, à la most JavaScript frameworks, has proven successful to them, so wrapping in styles seems like a natural fit. Vue’s single file components are an archetype here. See also: You Don’t Need CSS-in-JS: Why (and When) I Use Stylesheets Instead

The traits of serverless architecture
https://www.thoughtworks.com/insights/blog/traits-serverless-architecture
This article doesn’t aim to help you understand all of the topics in-depth but to give you a general overview of what you are in for. These are the traits of serverless architecture defined in this article: Low barrier-to-entry, Hostless, Stateless, Elasticity, Distributed, Event-driven.

The (not so) hidden cost of sharing code between iOS and Android
https://blogs.dropbox.com/tech/2019/08/the-not-so-hidden-cost-of-sharing-code-between-ios-and-android/
Although writing code once sounds like a great bargain, the associated overhead made the cost of this approach outweigh the benefits (which turned out to be smaller than expected anyway). In the end we no longer share mobile code via C++ (or any other non-standard way) and instead write code in the platform native languages. In addition we want our engineers to have a delightful experience and to be able to contribute back to the community. This is why we made the decision to align our practices with industry standards.

Data Hub: A Generalized Metadata Search & Discovery Tool
https://engineering.linkedin.com/blog/2019/data-hub
As the operator of the world’s largest professional network and the Economic Graph, LinkedIn’s Data team is constantly working on scaling its infrastructure to meet the demands of our ever-growing big data ecosystem. As the data grows in volume and richness, it becomes increasingly challenging for data scientists and engineers to discover the data assets available, understand their provenances, and take appropriate actions based on the insights. To help us continue scaling productivity and innovation in data alongside this growth, we created a generalized metadata search and discovery tool, Data Hub.

Static Analysis at Scale: An Instagram Story
https://instagram-engineering.com/static-analysis-at-scale-an-instagram-story-8f498ab71a0c
This post is about how we’ve used linting and automated refactoring to help manage the scale of our Python codebase. In the next few weeks, we’ll share more details of other tools and techniques we’ve developed to manage other aspects of our codebase’s quality.

I wasn’t getting hired as a Data Scientist. So I sought data on who is
https://towardsdatascience.com/i-wasnt-getting-hired-as-a-data-scientist-so-i-sought-data-on-who-is-c59afd7d56f5
I wasn’t getting hired as a Data Scientist. So I sought data on who is. Instead of focusing on skills thought to be required of data scientists, we can look at what they have actually done before.

Python is eating the world: How one developer’s side project became the hottest programming language on the planet
https://www.zdnet.com/article/python-is-eating-the-world-how-one-developers-side-project-became-the-hottest-programming-language-on-the-planet/
Frustrated by programming language shortcomings, Guido van Rossum created Python. With the language now used by millions, Nick Heath talks to van Rossum about Python’s past and explores what’s next. See also: Is Python Strangling R to Death?

Fresh Picks

aiXcoder
https://www.aixcoder.com/#/
https://mp.weixin.qq.com/s/CjRp3zCYyk_LP4wqxhLPDw

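An intelligent programming assistant:

  • Intelligent code completion: backed by a powerful deep learning engine, it gives more precise code suggestions;
  • Code style checking: it can check code style intelligently, helping developers improve code quality;
  • Programming pattern learning: it learns the developer's programming patterns on its own, getting better the more it is used.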

MongoDB 4.2 is now GA: Ready for your Production Apps
https://www.mongodb.com/blog/post/mongodb-42-is-now-ga-ready-for-your-production-apps
Key highlights of MongoDB 4.2 include:

  • Distributed Transactions
  • On-Demand Materialized Views
  • Wildcard Indexes
  • MongoDB Query Language
  • Retryable Reads and Writes

Highlights from Git 2.23
https://github.blog/2019-08-16-highlights-from-git-2-23/
The open source Git project just released Git 2.23 with features and bug fixes from over 77 contributors, 26 of them new. Here’s our look at some of the most exciting features and changes introduced since Git 2.22. See also: Use GitHub Classroom with Canvas, Google Classroom, or your own tools; GitHub intern project: Localization for Learning Lab.

npm CLI Roadmap - Summer 2019
https://blog.npmjs.org/post/186983646370/npm-cli-roadmap-summer-2019
Motion on the npm CLI project has been accelerating, and we’re now moving forward with a clear direction and vision. This document outlines what’s in store for the remainder of the npm v6 line, and what to expect in v7 and v8.

Announcing the Ionic React Release Candidate!
https://ionicframework.com/blog/announcing-ionic-react-release-candidate/
Ionic React RC marks the first major release of our vision to bring Ionic development to more developers on other frameworks. This was made possible by Ionic v4.0, which was completely re-written from the ground up focusing on web standards and not dependent on a particular framework. Ionic v4.0 makes it possible for us to target many frameworks while still having our core components be a single code base shared across all these frameworks.

Divjoy The React codebase generator
https://divjoy.com/
Use our free web-based tool to create the perfect codebase for your next project.

  • Choose your stack: UI Kit, Framework, Authentication.
  • Choose a template: Everything you need is included. Marketing pages, contact forms, pricing, faq, authentication, client-side routing. Even the forgot password flow works. You’ll never need to build that stuff again.

Lottie for Web
https://github.com/airbnb/lottie-web
Lottie is a mobile library for Web, and iOS that parses Adobe After Effects animations exported as json with Bodymovin and renders them natively on mobile! For the first time, designers can create and ship beautiful animations without an engineer painstakingly recreating it by hand.

Planet - Imaging the entire Earth, every day
https://nbremer.github.io/planet-globe/
The visual below shows the locations of the photos that were taken on January 25th, 2018 by Planet’s satellites (the white circles) which are continuously circling the Earth. The blue circle randomly picks a satellite to follow every few seconds. You can see some information about its location and speed (yes, that’s kilometers per second) in the lower right corner.

sqliterally - Lightweight SQL query builder
https://github.com/terkelg/sqliterally
SQLiterally makes it easy to compose safe parameterized SQL queries using template literals. Clauses are automatically arranged which means you can re-use, subquery and append new clauses as you like – order doesn’t matter. All queries are well formatted and ready to be passed directly to node-pg and mysql. Use SQLiterally as a lightweight alternative to extensive query builders like Knex.js or when big ORMs are over-kill.

Acorn
https://github.com/acornjs/acorn
A tiny, fast JavaScript parser, written completely in JavaScript.

NPKILL
https://github.com/voidcosmos/npkill
This tool allows you to list any node_modules directories in your system, as well as the space they take up. You can then select which ones you want to erase to free up space.

CutiePi tablet
https://cutiepi.io/index.html
CutiePi is an all-in-one Raspberry Pi tablet. Take your Raspberry Pi project, liberate it from the desk, and start creating wherever you go.

Anime4K
https://github.com/bloc97/Anime4K
Anime4K is a state-of-the-art*, open-source, high-quality real-time anime upscaling algorithm that can be implemented in any programming language, anywhere.

A Readable Specification of TLS 1.3
https://davidwong.fr/tls13/
The primary goal of TLS is to provide a secure channel between two communicating peers; the only requirement from the underlying transport is a reliable, in-order data stream. Specifically, the secure channel should provide the following properties: Authentication, Confidentiality, Integrity.

Ultimate Go
https://github.com/hoanhan101/ultimate-go
This repo contains my notes on learning Go and computer systems. Different people have different learning style. For me, I learn best by doing and walking through examples. Hence, I am trying to take notes carefully and comment directly on the source code, rather than writing up Markdown files. That way, I can understand every single line of code as I am reading and also be mindful of the theories behind the scene. In the mix, I also include links to other articles that I find helpful.

What is Paged Out!?
https://pagedout.institute/
Paged Out! is a new experimental (one article == one page) free magazine about programming (especially programming tricks!), hacking, security hacking, retro computers, modern computers, electronics, demoscene, and other similar topics. It’s made by the community for the community - the project is led by Gynvael Coldwind with multiple folks helping. And it’s not-for-profit (though in time we hope it will be self-sustained) - this means that the issues will always be free to download, share and print.

Design

Bai Ya: The Underlying Logic of Enterprise Service Products, and the "Youzan Product Design Principles"
https://mp.weixin.qq.com/s/-kLQPDU-9337mN1ebjuYqA
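Youzan's "Product Design Principles" were drawn up based on customer needs, Youzan's mission and vision, the current ecosystem, and our current stage of development. They are the basic principles every Youzan product must follow during design, and we optimize and iterate on them regularly. They are principles from a product perspective, not a complete market, operations, or technology perspective. From the product perspective, we divide the product design process into 4 parts: product definition, product design, product development, and product operations.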

Cabana 3.0. Is it a bird, a plane, a Design System, or a UI Starter Kit?
https://medium.com/sketch-app-sources/cabana-3-0-is-it-a-bird-a-plane-a-design-system-or-a-ui-starter-kit-a61982352e77
A super-helpful tool when creating UIs inside of Sketch.

Future by Design
https://medium.com/microsoft-design/future-by-design-7f32e4d6b6d1
How Design Day 2019 helps designers in a business evolve into businesspeople who Design.

Information on Display: New Features and More Accessible Data Tables
https://medium.com/google-design/information-on-display-new-features-and-more-accessible-data-tables-8d4025ddcb57
Data tables display sets of similar information so that it’s easy to scan — often ordered in a hierarchical or alphabetical way that helps users find patterns and insights. A variety of interactive elements (like selecting rows and column sort) support a range of use cases demonstrated in our design guidance.

Product and More

Get your work recognized: write a brag document
https://jvns.ca/blog/brag-documents/
There’s this idea that, if you do great work at your job, people will (or should!) automatically recognize that work and reward you for it with promotions / increased pay. In practice, it’s often more complicated than that – some kinds of important work are more visible/memorable than others. It’s frustrating to have done something really important and later realize that you didn’t get rewarded for it just because the people making the decision didn’t understand or remember what you did. So I want to talk about a tactic that I and lots of people I work with have used! This blog post isn’t just about being promoted or getting raises though. The ideas here have actually been more useful to me to help me reflect on themes in my work, what’s important to me, what I’m learning, and what I’d like to be doing differently. But they’ve definitely helped with promotions!

GitHub stars won’t pay your rent
https://medium.com/@kitze/github-stars-wont-pay-your-rent-8b348e12baed
It’s been a long time since I have written something here, but I don’t want to write articles for the sake of “keeping the blog alive”, screw that. Well, I finally have a story to tell. I finally launched the new version of Sizzy last month. It went from a simple web app to a full-fledged browser for designers and developers. I would say that it’s been a very exciting month, but actually, it’s been a bumpy ride for 2.5 years. I made a lot of mistakes and I learned a lot of lessons, so I wanted to share the entire story with you.

Seven Years at Full Speed: What Cultural Mechanisms Keep ByteDance Running?
https://mp.weixin.qq.com/s/hc2y2nVGZLBw1XdlhCFqRg
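To some extent, ByteDance's management mix includes, besides Google's open and candid way of working and Netflix's strongly empowering view of talent, Alibaba's model of centralization. It is like a Tower of Babel: senior management, represented by Zhang Yiming, makes rational strategic decisions, and the rest of the organization quickly moves into execution. Inside ByteDance, employees' complaints are rarely dismissed as unreasonable. From a communication-studies perspective, it has indeed achieved multi-channel information flow in its "publish-feedback" mechanism, but internally people prefer to frame this logic as practicing a company value: "small ego, big picture". The phrase did not originate at ByteDance, but ByteDance has taken it to an extreme. The same set of posters hangs on office walls in the 30-plus countries where ByteDance operates, and one of them reads "ego creates blind spots" (自负制造盲点). It derives from Google's culture: the smaller the ego and the greater the knowledge, the bigger the picture one can see. Put simply: do not take yourself too seriously, stay out of office politics, and spend more effort on improving your skills and growing the business.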

What Exactly Is the Microkernel Behind Huawei's "HarmonyOS" (鸿蒙)?
https://mp.weixin.qq.com/s?__biz=MzIwMzA2NzI1Ng==&mid=2655158364&idx=1&sn=1958e29465e3af428f885bfc4ed1fa66
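Here is a simple description of what a microkernel is: the kernel runs in kernel mode and contains only basic multitasking and scheduling; all other system services run in user mode, including the file system, the network protocol stack, and even memory management, and each driver is a separate user-mode process, with memory isolated between them. When an application needs a system service, it sends a message over IPC to the relevant user-mode service. In a monolithic kernel, by contrast, user applications use system services directly through system calls. For a microkernel, therefore, message passing is the basic form of interaction.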
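
The message-passing pattern described here can be illustrated with a toy sketch. This is not code from the linked article or from any real kernel: a thread stands in for a user-mode service process, a `queue.Queue` stands in for the IPC channel, and the names `file_service` and `read_via_ipc` are hypothetical.

```python
import queue
import threading


def file_service(inbox: queue.Queue) -> None:
    """Hypothetical user-mode file service: handles one request message at a time."""
    files = {"/etc/motd": "hello"}  # toy in-memory "file system"
    while True:
        msg = inbox.get()
        if msg is None:  # shutdown signal
            break
        path, reply_box = msg
        reply_box.put(files.get(path, ""))


def read_via_ipc(inbox: queue.Queue, path: str) -> str:
    """Application side: send a request message and wait for the reply."""
    reply_box: queue.Queue = queue.Queue()
    inbox.put((path, reply_box))
    return reply_box.get(timeout=1)


# The application never calls the service's code directly; it only exchanges messages.
inbox: queue.Queue = queue.Queue()
server = threading.Thread(target=file_service, args=(inbox,))
server.start()
content = read_via_ipc(inbox, "/etc/motd")
inbox.put(None)
server.join()
print(content)
```

In a monolithic design the application would instead call the file-system code directly (the analogue of a system call), with no request or reply message in between.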

Readers interested in data visualization can read 墨者修齐 2019-08-19 from AntV: KeyLines Time Bar, earthquake visualization, and time-series visualization of artworks

– THE END –


IT技术精华 2019-08-19 16:08:07
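14621 Installing the integrated menu of the free StatPlus on Mac Excel 2016 v16 and later: Excel Add-ins ==> enable StatPlusMacAddin

A course I am taking requires Excel's analysis add-in, but Excel for Mac does not include one by default, so AnalystSoft's StatPlus plug-in has to be installed. The standard installation steps below enable the corresponding integrated Excel menu.
1 Download the StatPlus for Mac installer package from the official website and unpack it;
2 Move StatPlus to "Applications" and open StatPlus;
3 In the StatPlus menu, choose SpreadSheet and enable Excel menu integration mode: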
Excel menu integration: Statplus mac

With older versions of Excel (Microsoft Office Excel 2011, 2013, and 2016 v15 and earlier), Excel restarts and shows the integrated menu:
Mac Excel 2011: StatPlus 6

With Excel 2016 v16 or later (including Excel 2019), however, the following message appears:

Latest Excel 2016 versions (v16+) require additional step to enable StatPlus integration. Please follow the 'Integration with Excel 2016 v16' steps from the registration letter or contact us if you have any questions.

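In fact, the registration email says nothing about v16+, and the official website has no help entry on enabling Excel menu integration for Excel 2016 v16. Submitting a question through online help gets a reply describing how to enable it manually; following those instructions, I verified the setup on Excel 2016 v16.22 as follows: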

Steps to enable the integrated Excel menu: open Excel ==> Tools ==> Excel Add-ins ==> manually check ✔️ the StatPlusMacAddin module
Excel 2016 v16+ additional step enable StatPlus integration
After restarting Excel once more, the StatPlus toolbar appears in the Excel menu:
StatPlus Excel integrated menu



IT技术精华 2019-08-15 20:08:12
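14620 Syntax change since Perl 5.22: Unescaped left brace in regex is illegal here in regex; marked by <-- HERE in m/"%{ <-- HERE Ref

After the virtual host's operating system was upgraded, Perl was also upgraded to 5.26.x and the awstats statistics stopped updating. The error message from the backend was: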
Unescaped left brace in regex is illegal here in regex; marked by <-- HERE in m/"%{ <-- HERE Referer}i"/ at ./awstats.pl line 9045.

A quick search suggests that this kind of message first appeared in 5.22:
As written by @Leobaillard you can use his patch. If you want to fix this with your editor (vi) you can go to line 3936 and make the following changes
$ vi buildroot/output/host/usr/bin/automake
goto line :3936 and change
$text =~ s/\${([^ \t=:+{}]+)}/substitute_ac_subst_variables_worker ($1)/ge;
to
$text =~ s/\$\{([^ \t=:+\{\}]+)\}/substitute_ac_subst_variables_worker ($1)/ge;
The error is with automake and perl v5.26.

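Since Perl v5.22, using an unescaped literal { in a regular expression is deprecated and triggers a warning; since v5.26 it is no longer just a warning but a fatal syntax error.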

Fix: add an escape character (a backslash) before every "{" and "}" in the regular expressions:

diff -r1.1008 awstats.pl
9045,9050c9045,9050
< $LogFormatString =~ s/"%{Referer}i"/%refererquot/g;
< $LogFormatString =~ s/"%{User-Agent}i"/%uaquot/g;
< $LogFormatString =~ s/%{mod_gzip_input_size}n/%gzipin/g;
< $LogFormatString =~ s/%{mod_gzip_output_size}n/%gzipout/g;
< $LogFormatString =~ s/%{mod_gzip_compression_ratio}n/%gzipratio/g;
< $LogFormatString =~ s/(%{ratio}n)/%deflateratio/g;
---
> $LogFormatString =~ s/"%\{Referer\}i"/%refererquot/g;
> $LogFormatString =~ s/"%\{User-Agent\}i"/%uaquot/g;
> $LogFormatString =~ s/%\{mod_gzip_input_size\}n/%gzipin/g;
> $LogFormatString =~ s/%\{mod_gzip_output_size\}n/%gzipout/g;
> $LogFormatString =~ s/%\{mod_gzip_compression_ratio\}n/%gzipratio/g;
> $LogFormatString =~ s/(%\{ratio\}n)/%deflateratio/g;
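
The fix is a mechanical rewrite of each pattern: put a backslash in front of every literal brace. As a minimal sketch in Python (not part of awstats; the helper name `escape_braces` is hypothetical), the function below prefixes every brace that is not already escaped. It assumes every brace in the pattern is meant literally, so it must not be applied to patterns that use `{n,m}` quantifiers, and it does not handle an escaped backslash directly in front of a brace.

```python
import re

# Matches "{" or "}" when it is not already preceded by a backslash.
# Simplification: a doubled backslash before a brace ("\\{") is not handled.
_UNESCAPED_BRACE = re.compile(r'(?<!\\)([{}])')


def escape_braces(pattern: str) -> str:
    """Return the pattern with each unescaped literal brace prefixed by a backslash."""
    return _UNESCAPED_BRACE.sub(r'\\\1', pattern)


# Pattern text taken from the error message quoted in the post.
old_pattern = '"%{Referer}i"'
new_pattern = escape_braces(old_pattern)
print(new_pattern)
```

Running the helper a second time leaves the pattern unchanged, because braces that already carry a backslash are skipped.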


IT技术精华 2019-08-15 20:08:05