Wiki
We have compiled hundreds of related entries to help you understand "artificial intelligence".
To minimize the overall risk, for each sample we select the class label that minimizes the conditional risk R(c|x); the resulting classifier h∗ is called the Bayes optimal classifier.
In model selection, a single "best" model is usually chosen from a set of candidate models, and this selected model is then used for prediction. In contrast to using a single best model, Bayesian model averaging assigns a weight to each candidate model and takes a weighted average of their predictions to obtain the final value. The weight assigned to a model is […]
For each sample x, if h minimizes the conditional risk R(h(x)|x), then the overall risk is also minimized. This gives rise to the Bayes decision rule: to minimize the overall risk, we only need to choose, for each sample, the class label that minimizes the conditional risk R(c|x […]
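In standard notation (matching the two entries above, with 𝒴 the set of class labels), the Bayes decision rule can be written as:

```latex
h^{*}(x) = \operatorname*{arg\,min}_{c \in \mathcal{Y}} R(c \mid x)
```

This h∗ is exactly the Bayes optimal classifier described above.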
Batch normalization (BN) is a regularization-style technique that speeds up the training of large convolutional networks and improves classification accuracy after convergence. When BN is applied to a layer of a neural network, it standardizes that layer's activations within each mini-batch so that the output is approximately normalized toward an N(0,1) distribution, reducing […]
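As a rough illustration of the per-mini-batch standardization described above, here is a minimal NumPy sketch of the training-time forward pass (the function name and parameters are illustrative; gamma and beta are the learnable scale and shift):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standardize a mini-batch per feature, then apply a learnable scale and shift."""
    mu = x.mean(axis=0)                      # per-feature mean over the mini-batch
    var = x.var(axis=0)                      # per-feature variance over the mini-batch
    x_hat = (x - mu) / np.sqrt(var + eps)    # roughly N(0, 1) per feature
    return gamma * x_hat + beta              # learnable scale (gamma) and shift (beta)

x = np.random.randn(32, 8) * 5 + 3           # mini-batch of 32 samples, 8 features
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(y.mean(axis=0).round(6), y.std(axis=0).round(6))   # ~0 mean, ~1 std per feature
```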
In ensemble learning, when the individual learners in an ensemble are homogeneous, i.e., generated by the same learning algorithm, they are called base learners, and the corresponding learning algorithm is called the base learning algorithm.
Long short-term memory (LSTM) is a recurrent neural network (RNN) architecture first published in 1997. Thanks to its distinctive gated design, LSTM is well suited to processing and predicting important events in time series with very long intervals and delays. […]
Information entropy is a quantity that measures the amount of information. It was proposed by Shannon in 1948, who borrowed the concept of entropy from thermodynamics, called the average amount of information remaining after redundancy is removed the information entropy, and gave a corresponding mathematical expression. Three properties of information entropy: Monotonicity: the higher the probability of an event, the less information it carries […]
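Shannon's expression for the entropy of a discrete random variable X with outcome probabilities p(x_i) is:

```latex
H(X) = -\sum_{i} p(x_i) \log_2 p(x_i)
```

Since −log p grows as p shrinks, rare events contribute more information, consistent with the monotonicity property above.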
Knowledge representation refers to the representation and description of knowledge; it is concerned with how agents can reasonably use relevant knowledge, and studies thinking as a computational process. Strictly speaking, knowledge representation and knowledge reasoning are two closely related concepts within the same research field, but in practice "knowledge representation" is also used in a broad sense that includes reasoning.
The exponential loss function is a loss function commonly used in the AdaBoost algorithm; its expression takes an exponential form. Common loss functions include: Exponential loss, mainly used in the AdaBoost ensemble learning algorithm; Hinge loss H […]
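For reference, the exponential loss for a label y ∈ {−1, +1} and a real-valued prediction f(x) is:

```latex
\ell_{\exp}(y, f(x)) = e^{-y\, f(x)}
```

It penalizes predictions whose sign disagrees with y exponentially, which is what AdaBoost minimizes.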
In machine learning, ground truth refers to the accurate, given labels of the training set for the classification task in supervised learning; it is generally used for error estimation and performance evaluation. In supervised learning, labeled data usually appears in the form (x, t), where x is the input data and t is the label. The correct label is Grou […]
Error-ambiguity decomposition (sometimes rendered as error-divergence decomposition) refers to decomposing an ensemble's generalization error, which can be expressed as follows: the left-hand side E is the ensemble's generalization error, and the right-hand side $latex {\over […]
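In its standard form (as given in ensemble-learning textbooks), the decomposition reads:

```latex
E = \bar{E} - \bar{A}
```

where \(\bar{E}\) is the weighted average generalization error of the individual learners and \(\bar{A}\) is their weighted average ambiguity (disagreement); more accurate and more diverse individual learners therefore yield a better ensemble.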
MCMC (Markov chain Monte Carlo) is a family of algorithms for sampling from probability distributions based on Markov chains. It approximates the posterior distribution of a parameter of interest by random sampling in the probability space. The theoretical foundation of MCMC is the Markov process: to sample from a specified distribution, we simulate a Markov process, started from an arbitrary state, whose stationary distribution is the target distribution.
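As a concrete illustration (not from the entry itself), here is a minimal Metropolis-Hastings sampler, one of the best-known MCMC algorithms; the function name and parameters are illustrative:

```python
import numpy as np

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=0):
    """Metropolis-Hastings with a symmetric Gaussian random-walk proposal.

    log_target: log density of the (possibly unnormalized) target distribution.
    """
    rng = np.random.default_rng(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + step * rng.standard_normal()   # simulate the Markov chain
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return np.array(samples)

# Example: sample from a standard normal (log density up to a constant).
draws = metropolis_hastings(lambda x: -0.5 * x * x, n_samples=10_000)
print(draws.mean(), draws.std())   # should approach 0 and 1
```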
An evolutionary algorithm is a general problem-solving method that draws on natural selection and genetic mechanisms from the biological world. Basic approach: use simple encoding techniques to represent complex structures; use simple genetic operators and survival-of-the-fittest selection to guide learning and determine the search direction; organize the search around a population, so that […]
A genetic algorithm (GA) is a search algorithm used in computational mathematics to solve optimization problems, and is one type of evolutionary algorithm. Evolutionary algorithms originally borrowed phenomena from evolutionary biology, including inheritance, mutation, natural selection, and crossover (hybridization). Genetic algorithms are usually implemented as computer simulations. For an optimization problem, if there are […]
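The following toy sketch (names and parameters are illustrative) shows the usual GA loop of selection, crossover, and mutation over a binary encoding:

```python
import random

def genetic_algorithm(fitness, n_bits=20, pop_size=50, generations=100,
                      crossover_rate=0.9, mutation_rate=0.01):
    """Toy GA: binary encoding, tournament selection, one-point crossover, bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Tournament selection: keep the fitter of two random individuals.
        selected = [max(random.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        for p1, p2 in zip(selected[::2], selected[1::2]):
            if random.random() < crossover_rate:          # one-point crossover
                point = random.randint(1, n_bits - 1)
                p1, p2 = p1[:point] + p2[point:], p2[:point] + p1[point:]
            for child in (p1, p2):
                # Bit-flip mutation with small per-bit probability.
                children.append([b ^ 1 if random.random() < mutation_rate else b
                                 for b in child])
        pop = children
        best = max(pop + [best], key=fitness)
    return best

# Example: maximize the number of 1-bits ("OneMax").
print(genetic_algorithm(fitness=sum))
```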
Gain ratio usually refers to the information gain ratio: the ratio of a node's information gain to its split information. Gain ratio is commonly used as one of the attribute-selection criteria; the other two common criteria are information gain and the Gini index. The gain ratio formula is as follows: $latex GainRatio(R) = \frac{Gain(R)}{SplitInfo(R)}$
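A minimal sketch of computing the gain ratio for a single categorical attribute, under the standard definitions of entropy, information gain, and split information (names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(attribute_values, labels):
    """attribute_values: one attribute's value per sample; labels: class labels."""
    n = len(labels)
    groups = {}
    for v, y in zip(attribute_values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: parent entropy minus the weighted entropy of the children.
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())
    # Split information: entropy of the partition induced by the attribute.
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0

# Example: one binary attribute against binary class labels.
print(gain_ratio(['a', 'a', 'b', 'b'], [1, 1, 0, 1]))
```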
Hilbert space is a complete inner product space, which can be understood as a vector space with an inner product that is complete under the induced norm. It builds on finite-dimensional Euclidean space and can be seen as a generalization of the latter, no longer restricted to real scalars or to finite dimensions. Like Euclidean space, a Hilbert space is an inner product space, with notions of distance and angle […]
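Concretely, the inner product induces a norm, and completeness is required with respect to that norm:

```latex
\|x\| = \sqrt{\langle x, x \rangle}
```

An inner product space is a Hilbert space precisely when every Cauchy sequence converges under this norm.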
A hidden Markov model (HMM) is a probabilistic model for time series. It describes a process in which a hidden Markov chain generates an unobservable sequence of states, and each state in turn generates an observation, yielding an observable random sequence. The hidden Markov model is a statistical model used to describe a Markov chain with hidden, unknown parameters.
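As an illustration of how an HMM assigns probability to an observable sequence, here is a minimal forward-algorithm sketch (matrix names follow the common π/A/B convention; the example numbers are made up):

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: probability of an observation sequence under an HMM.

    pi: initial state distribution, shape (n,)
    A:  state transition matrix, shape (n, n), A[i, j] = P(state j | state i)
    B:  emission matrix, shape (n, m),         B[i, k] = P(observation k | state i)
    """
    alpha = pi * B[:, obs[0]]              # initialize with the first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # propagate states, then weight by emission
    return alpha.sum()

# Toy 2-state, 2-symbol example.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(forward(pi, A, B, obs=[0, 1, 0]))
```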
Hidden layers are the layers of a multi-layer feedforward neural network other than the input and output layers. Hidden layers neither receive signals directly from the outside world nor send signals directly to it; they are needed when the data is not linearly separable. Hidden-layer units can take many forms, such as max-pooling layers and convolutional layers, which perform different mathematical functions. […]
Hard voting is a voting method that directly outputs class labels, used mainly in classification algorithms. Voting is a combination strategy for classification problems in ensemble learning; its basic idea is to select the class output most often by the individual learners. Hard voting selects the label predicted by the most learners; in the event of a tie, the tied labels are sorted in ascending order and the first is chosen. […]
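A minimal sketch of hard voting with the ascending-order tie-break described above (names are illustrative):

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote over class labels; ties broken by ascending label order."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Among the most frequent labels, return the smallest one.
    return min(label for label, c in counts.items() if c == top)

# Three classifiers vote: 'cat' wins 2-1.
print(hard_vote(['cat', 'dog', 'cat']))
```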
Independent and identically distributed (IID) means that each variable in a set of random variables has the same probability distribution and that the variables are mutually independent. A set of random variables being independent and identically distributed does not mean that every event in their sample space has the same probability. For example, the sequence of results obtained by rolling a loaded die is independent and identically distributed, but the probability of each [...]
Incremental learning means that when new data arrives, the model is updated using only that new data. Incremental learning can continuously acquire new knowledge from new samples while preserving most of the previously learned knowledge. It resembles the human learning model: a process of gradual accumulation and updating. The traditional approach, by contrast, is batch learning, which prepares all the data[…]
A knowledge base is a special database used for knowledge management, facilitating the collection, organization, and retrieval of knowledge in related fields. The knowledge it contains comes from domain experts; it is a collection of domain knowledge for problem solving, covering basic facts, rules, and other relevant information. In knowledge engineering, a knowledge base is structured, easy to operate, easy to use, and comprehensive.
The k-nearest neighbor algorithm (KNN) is a basic classification and regression algorithm that determines a query point's class by a vote among the K training points closest to it. KNN characteristics: KNN is lazy learning; KNN has high computational complexity; different values of K can produce different classification results.
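A minimal sketch of KNN classification by majority vote among the K nearest points (the names and toy data are illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)     # Euclidean distances to x
    nearest = np.argsort(dists)[:k]                 # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([4.5, 5.0]), k=3))   # -> 1
```

Note that all the work happens at prediction time, which is why KNN counts as lazy learning and why its computational cost grows with the training set.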
JS divergence measures the similarity of two probability distributions. It is a variant of KL divergence that resolves KL divergence's asymmetry: JS divergence is symmetric, and its value lies between 0 and 1. It is defined as $latex JS(P \| Q) = \frac{1}{2} KL(P \| M) + \frac{1}{2} KL(Q \| M)$ with $latex M = \frac{1}{2}(P + Q)$. There is a problem when measuring with KL divergence and JS divergence: if two […]
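A minimal sketch of the JS divergence via the definition above, using base-2 logarithms so the value stays in [0, 1] (names are illustrative):

```python
import numpy as np

def kl(p, q):
    """KL divergence for discrete distributions (0 * log 0 treated as 0)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def js(p, q):
    """JS(P || Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2."""
    m = (np.asarray(p, float) + np.asarray(q, float)) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

print(js([1.0, 0.0], [0.5, 0.5]))   # symmetric: js(p, q) == js(q, p)
```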