In the model, an improved edit-distance-based algorithm is proposed to match strings; an attribute matching graph is constructed, and a double verification strategy is adopted to identify duplicate records.
这给识别重复记录带来了很大不便,导致传统的去重算法无法达到很好的效果。
This paper studies the problem of detecting approximately duplicate records when data arrive in increments while the data schema and matching rule set remain unchanged, and presents IACT (Incremental Algorithms based on Clustering Trees for data cleansing), an incremental data-cleansing algorithm based on clustering trees.
研究了在数据模式与匹配规则不变的前提下,数据集动态增加时近似重复记录的识别问题,提出了一种基于聚类树的增量式数据清洗算法IACT。
Based on this idea, we study the problem of detecting approximately duplicate records as increments of data arrive, with no changes in the data schema or matching model, and present an incremental algorithm, IPQS (Incremental PQS), for detecting such records.
介绍了优先队列方法(Priority Queue Strategy, PQS),并以此为基础,研究了在数据模式与匹配模型不变的前提下,数据源动态增加时近似重复记录识别问题,提出了一种增量式算法IPQS(Incremental PQS),最后给出了实验结果。
Measures should be taken to reduce omitted or duplicate records.
应该采取措施以减少遗漏或重复记录。
CURE Algorithm-based Inspection of Duplicated Records
基于CURE算法的相似重复记录检测
Research on DBSCAN-Based Detection Method of Approximate Duplicate Records
基于DBSCAN算法的相似重复记录检测方法研究
Duplicate Record Detection Method Based on Optimal Bipartite Graph Matching
一种基于二分图最优匹配的重复记录检测算法
A Model of Identifying Duplicate Records for Deep Web Environment
一种应用于Deep Web环境下的重复记录识别模型
Research on optimal feature selection method for approximately duplicate records detecting
基于相似重复记录检测的特征优选方法研究
A FUZZY MATCHING METHOD TO IDENTIFY APPROXIMATELY DUPLICATE RECORDS
一种识别相似重复记录的模糊匹配方法
Research on Eliminating Duplicate Records Based on Chinese Character Code
基于汉字机内编码的中文相似重复记录消除研究
Research on the Mechanism of Redo Log Record Locating in SQL Server Instance Recovery
SQL Server实例恢复中重做日志记录定位机制研究
Accounting is based on a double-entry system, which means that we record the dual effects of a business transaction.
会计以复式记帐为基础,即记录每笔交易的双重影响。
DVD-RW DVD Rewritable
可重复刻录DVD盘
a reproduction of a written record (e.g. of a legal or school record).
书面记录的复制品(如法律或学校记录)。
a tape recorder that records and reproduces dictation.
能记录复制电话录音的录音机。
For example, if you use a capture/playback tool to add a record, a replay of the script will get a "duplicate record on file" error.
例如,如果你使用回归测试工具来添加一个记录,脚本文件将得到“重复的记录”这样的错误。
Recording estimation of thenar motor units and its reproducibility by fully automated incremental stimulating
自动递增刺激记录大鱼际肌运动单位估数及其可重复性
This is the default repeating delimiter for the entire schema unless specified elsewhere on a per-record basis.
这是适用于整个架构的默认重复分隔符,除非为每个记录另行指定。
Thus, student names and addresses must be redundantly recorded in both files.
因此学生的姓名和地址必须重复地记录在两个文件中。
The equilibrium, which the bookkeeping record achieves through the accounting equation, is an essential feature of double entry.
簿记记录通过运用会计等式所达到的平衡关系是复式记帐的一个重要特点。