Search results

Lossy Count Algorithm
...ency]] exceeds a user-given threshold. The algorithm works by dividing the data stream into buckets for frequent items, but fill as many buckets as possibl ...here data takes the form of a continuous data stream instead of a finite [[data set]], such as [[network traffic measurement]]s, [[web server]] logs, and [ ...

2 KB (299 words) - 04:52, 3 March 2023
Affinity propagation
{{Short description|Algorithm in data mining}} ...Frey |author2=Delbert Dueck |title=Clustering by passing messages between data points |journal=[[Science (journal)|Science]] |volume=315 |year=2007 |pages ...

6 KB (950 words) - 03:39, 8 May 2024
Triplet loss
{{Short description|Function for machine learning algorithms}} '''Triplet loss''' is a [[machine learning]] [[Loss functions for classifi ...iscrimination across varied conditions. In the context of face detection, data points correspond to images. ...

8 KB (1,249 words) - 21:02, 23 February 2025
Alpha algorithm
The '''α-algorithm''' or '''α-miner''' is an algorithm used in [[process mining]], aimed at reconstructing causality from a set of [[sequence of events|seq ...ring process models from event logs", ''IEEE Transactions on Knowledge and Data Engineering'', vol 16</ref> The goal of Alpha miner is to convert the event ...

8 KB (1,325 words) - 15:42, 8 January 2024
Inductive miner
...=5|pages=28–41|doi=10.1109/MCI.2009.935307|s2cid=14520822 }}</ref> Various algorithms proposed previously give process models of slightly different type from the ...Cite book|last=Leemans|first=S. J. J.|date=2017-05-09|title=Robust process mining with guarantees - SIKS Dissertation Series No. 2017-12|url=https://research ...

7 KB (1,196 words) - 20:56, 29 January 2025
String kernel
In [[machine learning]] and [[data mining]], a '''string kernel''' is a [[Positive-definite kernel|kernel function]] ...s|clustered]] or [[statistical classification|classified]], e.g. in [[text mining]] and [[bioinformatics|gene analysis]].<ref> ...

7 KB (984 words) - 16:58, 22 August 2023
Tensor decomposition
...ar algebra]], a '''tensor decomposition''' is any scheme for expressing a "data tensor" (M-way array) as a sequence of elementary operations acting on othe ...06-30 |title=Proceedings of the 2016 SIAM International Conference on Data Mining |chapter-url=https://epubs.siam.org/doi/10.1137/1.9781611974348.80 |languag ...

7 KB (965 words) - 09:41, 28 November 2024
Feature scaling
...so known as [[data normalization]] and is generally performed during the [[data preprocessing]] step. ...he range of values of raw data varies widely, in some [[machine learning]] algorithms, objective functions will not work properly without [[Normalization (statis ...

8 KB (1,131 words) - 02:18, 24 August 2024
Differentially private analysis of graphs
|journal=Kao MY. (Eds) Encyclopedia of Algorithms. Springer, Berlin, Heidelberg ...cs while preserving [[differential privacy]]. Such algorithms are used for data represented in the form of a graph where nodes correspond to individuals an ...

6 KB (857 words) - 05:03, 12 April 2024
Calinski–Harabasz index
...terion (VRC), is a metric for evaluating [[Clustering algorithm|clustering algorithms]], introduced by Tadeusz Caliński and Jerzy Harabasz in 1974.<ref name="cal Given a data set of ''n'' points: {'''x'''<sub>1</sub>, ..., '''x'''<sub>''n''</sub>}, a ...

7 KB (1,035 words) - 18:30, 30 July 2024
Kinetic priority queue
...e. Kinetic priority queues have been used as components of several kinetic data structures, as well as to solve some important non-kinetic problems such as ...ter or hanger, however it is less local and responsive than the heap-based data-structures. ...

6 KB (878 words) - 21:15, 2 February 2024
Local outlier factor
...oceedings of the 2000 ACM SIGMOD International Conference on Management of Data| series = [[SIGMOD]]| isbn = 1-58113-217-4| pages = 93–104| url = http://ww ...Outliers | doi = 10.1007/978-3-540-48247-5_28 | title = Principles of Data Mining and Knowledge Discovery | series = Lecture Notes in Computer Science | volu ...

13 KB (1,852 words) - 15:38, 19 February 2025
Flajolet–Martin algorithm
...ia.fr/flajolet/Publications/DuFl03-LNCS.pdf |accessdate=2016-12-11 |title= Algorithms - ESA 2003 |volume=2832 |pages=605 |series=Lecture Notes in Computer Scienc ...th ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems of data - PODS '10 |page=41 |year=2010 |last1=Kane |first1=Daniel M. |last2=Nelson ...

8 KB (1,164 words) - 07:31, 22 February 2025
Conformance checking
...s.<ref name="Aalst2013">{{cite book|author=Wil van der Aalst|title=Process Mining: Discovery, Conformance and Enhancement of Business Processes|publisher=Spr | journal = IEEE Transactions on Knowledge and Data Engineering ...

11 KB (1,835 words) - 12:52, 29 January 2023
Contrast set learning
...ead a collection of data and collect information that is used to place new data into a series of discrete categories, contrast set learning takes the categ | title = Detecting group differences: Mining contrast sets ...

16 KB (2,419 words) - 22:00, 25 January 2024
K-means++
{{short description |Algorithm in data mining}} In [[data mining]], '''''k''-means++'''<ref>{{Cite conference ...

11 KB (1,566 words) - 03:37, 21 December 2024
Random surfing model
...ournal=Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms|publisher=Department of Mathematical Sciences, Carnegie Mellon University|p ...irst1=Mohammed J.|title=Data Mining and Analysis: Fundamental Concepts and Algorithms|last2=Meira, Jr.|first2=Wagner|publisher=Cambridge University Press|year=20 ...

7 KB (1,124 words) - 11:42, 8 May 2024
Wiener connector
=== Exact algorithms === === Approximation algorithms === ...

9 KB (1,325 words) - 20:58, 12 October 2024
Time-series segmentation
...pe=pdf#page=14 Segmenting time series: A survey and novel approach]." Data mining in time series databases 57 (2004): 1-22.</ref> Probabilistic methods based ==Segmentation algorithms== ...

5 KB (739 words) - 21:52, 12 June 2024
Growing self-organizing map
...ry based on a heuristic. By using the value called Spread Factor (SF), the data analyst has the ability to control the growth of the GSOM. ##Calculate the growth threshold (<math>GT</math>) for the given data set of dimension <math>D</math> according to the spread factor (<math>SF</m ...

7 KB (1,039 words) - 12:03, 27 July 2023
