Archive for: July, 2020

Data Mining


Data mining is a central concept in data science. The way an organization uses data science analysis to support decision making can be compared to diamond mining and processing: you dig up rough diamonds and then polish them through several complex steps. Data mining is a similarly systematic process. It applies information technology, such as the automated discovery and evaluation of patterns in data, and it may also require an analyst's creativity, business knowledge and common sense.

Every business decision-making problem is unique, with its own combination of goals, constraints and characteristics. We can approach a business problem with an engineering mindset, that is, by decomposing it into a set of subtasks. Some of these subtasks are specific to the business problem at hand, while others are common data mining tasks. A critical skill in data science is the ability to break a data analytics problem down into parts so that each part matches a known task for which solutions are available. So before looking at the data mining process itself, it is useful to discuss the common types of data mining tasks, which will let us be more concrete when we present the data mining process and its concepts.