Wiki
We have compiled hundreds of related entries to help you understand artificial intelligence.
The DQ-LoRe framework utilizes Dual Queries (DQ) and Low-Rank approximation Re-ranking (LoRe) to automatically select exemplars for in-context learning.
Contrastive learning is a technique that enhances the performance of vision tasks by using the principle of contrasting samples against each other to learn properties that are common between data classes and properties that distinguish one class from another.
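As a sketch of the idea, the InfoNCE objective commonly used in contrastive learning pulls an anchor toward a "positive" (similar) sample and pushes it away from negatives. The vectors, temperature, and function names below are illustrative, not taken from any specific paper:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # Similarities between the anchor and the positive / each negative
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    # Softmax cross-entropy with the positive at index 0
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[0] / sum(exps))

anchor   = [1.0, 0.0]
positive = [0.9, 0.1]   # similar to the anchor -> low loss
negative = [0.0, 1.0]   # dissimilar -> pushed apart
loss = info_nce(anchor, positive, [negative])
```

Minimizing this loss makes representations of the same class cluster together while different classes spread apart.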
By addressing the limitations of the traditional LSTM and incorporating novel components such as exponential gating, matrix memory, and a parallelizable architecture, xLSTM opens up new possibilities for LLMs.
A point cloud is a dataset of points in space that can represent a 3D shape or object, typically acquired by a 3D scanner.
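In its simplest form a point cloud is just a collection of (x, y, z) coordinates; a minimal sketch (the cloud below is made up, real scans hold millions of points, often with extra attributes such as color or intensity):

```python
# A toy point cloud as a plain list of (x, y, z) tuples
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

def centroid(points):
    # Mean position of all points
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def bounding_box(points):
    # Axis-aligned box enclosing the cloud
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs
```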
Referring Image Segmentation (RIS) aims to segment the target object referred to by natural language expressions. However, previous methods rely on a strong assumption that a sentence must describe an object in an image.
The multiple drafts model is a physicalist theory of consciousness based on cognitivism, proposed by Daniel Dennett. The theory views the mind from the perspective of information processing. Dennett published Consciousness Explained in 1991.
KAN: Kolmogorov-Arnold Networks. The paper proposes a promising alternative to the Multilayer Perceptron (MLP) called Kolmogorov-Arnold Networks (KAN). The name KAN comes from […]
The Kolmogorov-Arnold representation theorem makes complex dynamical systems easier to analyze.
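For reference, the theorem states that every continuous function of n variables on a bounded domain can be written as a finite composition of continuous single-variable functions and addition:

```latex
f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

where each \(\Phi_q\) and \(\phi_{q,p}\) is a continuous function of one variable.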
Action model learning is a process within the field of artificial intelligence in which models are developed to predict the effects of an agent's actions on its environment.
True Positive Rate (TPR) is a metric used in statistics, machine learning, and medical diagnosis to evaluate the performance of binary classification models. It represents the proportion of actual positive cases that are correctly identified or classified as positive by the model. TPR is also known as sensitivity, recall, or […]
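The definition reduces to a one-line formula; the confusion-matrix counts below are hypothetical:

```python
def true_positive_rate(tp, fn):
    # TPR = TP / (TP + FN): the share of actual positives the model catches
    return tp / (tp + fn)

# Hypothetical counts: 80 positives correctly identified, 20 missed
tpr = true_positive_rate(tp=80, fn=20)   # -> 0.8
```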
Glitch tokens are anomalous tokens in a large language model's vocabulary that, rather than being processed normally, cause the model to produce abnormal output. A research team from Huazhong University of Science and Technology, Nanyang Technological University, and other universities published a study in 2024 titled "Glitch Tokens in […]
Multimodal large language models combine the power of natural language processing (NLP) with other modalities such as images, audio, or video.
Compared with other LLM upgrading methods using mixture of experts, DUS does not require complex changes for efficient training and inference.
In the field of deep learning, Grokking refers to a phenomenon in the training of neural networks in which good generalization emerges long after the training error has already decayed to near zero.
Scaling laws in deep learning refer to the relationship between a functional property of interest (usually a test loss or some performance metric on a fine-tuning task) and properties of the architecture or optimization procedure (such as model size, width, or training compute).
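Empirically, such relationships often take a power-law form, e.g. loss falling as a power of parameter count. A minimal sketch; the constants below are made up for illustration, not fitted to any real model family:

```python
def power_law_loss(n_params, a=10.0, b=0.076):
    # Illustrative scaling law L(N) = a * N**(-b):
    # loss decreases smoothly, with diminishing returns, as N grows
    return a * n_params ** -b

# Loss at three (hypothetical) model sizes
losses = [power_law_loss(n) for n in (1e6, 1e8, 1e10)]
```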
Emergence in the field of artificial intelligence refers to a phenomenon in which complex collective behaviors or structures emerge through the interaction of simple individuals or rules. In artificial intelligence, this emergence can refer to high-level features or behaviors learned by the model that are not directly designed […]
Explainable AI (XAI) is a set of processes and methods that allow human users to understand and trust the results and outputs created by machine learning algorithms.
Conditional computation is a technique to reduce the total amount of computation by performing computation only when it is needed.
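A minimal sketch of the pattern, with a cheap gate deciding whether a costly branch runs at all (the gate, threshold, and "expensive" function are all illustrative stand-ins for learned components):

```python
def cheap_gate(x):
    # Lightweight router deciding whether the expensive path is needed
    return abs(x) > 1.0

def expensive_transform(x):
    # Stand-in for a costly sub-network
    return x ** 3

def conditional_forward(x):
    # Only pay for the expensive branch when the gate fires
    if cheap_gate(x):
        return expensive_transform(x)
    return x  # identity shortcut for "easy" inputs
```

Mixture-of-experts layers apply the same idea at scale: a router activates only a few experts per input.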
Statistical Classification is a supervised learning method used to classify new observations into one of the known categories.
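One of the simplest classifiers illustrating this is nearest-centroid classification: fit a mean vector per class, then assign each new observation to the closest centroid. The data below is made up:

```python
import math

def fit_centroids(samples, labels):
    # Mean feature vector per class
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def classify(x, centroids):
    # Assign the new observation to the closest class centroid
    return min(centroids, key=lambda y: math.dist(x, centroids[y]))

cents = fit_centroids([[0, 0], [1, 1], [9, 9], [10, 10]], ["a", "a", "b", "b"])
label = classify([0.5, 0.2], cents)   # -> "a"
```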
Variational Autoencoder (VAE) is an artificial neural network structure proposed by Diederik P. Kingma and Max Welling, belonging to the probabilistic graphical model and variational Bayesian method.
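Two pieces distinguish the VAE from a plain autoencoder: the reparameterization trick, which keeps sampling differentiable, and a KL-divergence term that regularizes the latent distribution toward a standard normal. A minimal sketch of both (for a one-dimensional latent; the encoder and decoder networks are omitted):

```python
import math
import random

def reparameterize(mu, log_var, rng=random.Random(0)):
    # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu, sigma
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # KL(N(mu, sigma^2) || N(0, 1)), the regularizer in the VAE objective
    return 0.5 * (math.exp(log_var) + mu ** 2 - 1.0 - log_var)
```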
Masked Language Modeling (MLM) is a deep learning technique widely used in natural language processing (NLP) tasks, especially in the training of Transformer models such as BERT and RoBERTa.
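The core preprocessing step is simple: randomly replace a fraction of input tokens with a [MASK] token and train the model to recover the originals. A minimal sketch (the 15% rate follows BERT's convention; real pipelines also sometimes substitute random tokens instead of [MASK]):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, rng=random.Random(0)):
    # Replace a random subset of tokens with [MASK]; the model is trained
    # to predict the originals from the surrounding context.
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append("[MASK]")
            targets.append(tok)   # prediction target
        else:
            masked.append(tok)
            targets.append(None)  # not a prediction target
    return masked, targets
```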
Knowledge Engineering is a branch of Artificial Intelligence (AI) that develops rules and applies them to data to mimic the thought processes of a person with expertise on a particular subject.
Inception Score (IS) is an objective performance metric used to evaluate the quality of generated or synthetic images produced by a generative adversarial network (GAN).
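The score is the exponential of the average KL divergence between each image's class distribution p(y|x) (from a pretrained Inception classifier) and the marginal p(y); confident, diverse predictions score high. A sketch on toy distributions standing in for real classifier outputs:

```python
import math

def inception_score(cond_probs):
    # cond_probs: p(y|x) for each generated image x (each row sums to 1)
    n, k = len(cond_probs), len(cond_probs[0])
    marginal = [sum(row[j] for row in cond_probs) / n for j in range(k)]
    # IS = exp( mean_x KL(p(y|x) || p(y)) )
    kl_mean = sum(
        sum(p * math.log(p / marginal[j]) for j, p in enumerate(row) if p > 0)
        for row in cond_probs
    ) / n
    return math.exp(kl_mean)

# Confident, diverse predictions -> high score; uniform ones -> score of 1
sharp   = inception_score([[1.0, 0.0], [0.0, 1.0]])   # -> 2.0
uniform = inception_score([[0.5, 0.5], [0.5, 0.5]])   # -> 1.0
```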
Fuzzy Logic is a method of processing variables that allows a variable to take multiple possible truth values at once. Fuzzy logic attempts to solve problems using an open, imprecise spectrum of data and heuristic methods in order to arrive at a set of accurate conclusions.
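Instead of a crisp true/false, each statement gets a degree of truth in [0, 1] via a membership function. A minimal sketch with a made-up triangular membership for "warm" temperatures:

```python
def warm_membership(temp_c):
    # Triangular membership function: degree to which temp_c is "warm",
    # peaking at 25 C and falling to 0 outside (15, 35)
    if temp_c <= 15 or temp_c >= 35:
        return 0.0
    if temp_c <= 25:
        return (temp_c - 15) / 10
    return (35 - temp_c) / 10

def fuzzy_and(a, b):
    # min() is a common t-norm for fuzzy conjunction
    return min(a, b)

degree = warm_membership(20)   # -> 0.5, i.e. "somewhat warm"
```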