In this article, we will give an overview of using Flat Files and Raw Files in SSIS, then we will illustrate some of the differences between using these two types.
This is the ninth article in the SSIS features face-to-face series which aims to remove confusion around similar features in SQL Server Integration Services.
In general, a flat file doesn’t contain internal hierarchy; it may contain text, log entries or data in tabular form. More specifically, SSIS Flat Files are text files that store tabular data and are manipulated line-by-line.
To handle Flat Files, you must create an SSIS Flat File connection manager, where you define the Flat File metadata. The SSIS Flat File connection manager can handle three Flat File formats: delimited, fixed width, and ragged right.
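As a rough illustration of what "tabular text handled line by line" means for the delimited, fixed-width, and ragged-right layouts, here is a minimal Python sketch. It is not how SSIS parses files internally; the function names, column widths, and sample rows are assumptions made for this example, and the delimited parser ignores text qualifiers.

```python
from typing import List


def parse_delimited(line: str, delimiter: str = ",") -> List[str]:
    """Split one delimited row into column values (no text-qualifier support)."""
    return line.rstrip("\r\n").split(delimiter)


def parse_fixed_width(line: str, widths: List[int]) -> List[str]:
    """Slice one fixed-width row into columns using the given widths."""
    values, start = [], 0
    for width in widths:
        values.append(line[start:start + width].strip())
        start += width
    return values


def parse_ragged_right(line: str, widths: List[int]) -> List[str]:
    """Fixed-width columns, except the last column runs to the end of the line.

    `widths` describes every column except the last one.
    """
    line = line.rstrip("\r\n")
    head = parse_fixed_width(line, widths)
    tail = line[sum(widths):].strip()
    return head + [tail]
```

For instance, `parse_delimited("1,Alice,42.5\n")` yields three string values; every value is still text at this point, which is why a conversion step is needed afterwards.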
The SSIS Flat File connection manager editor contains four tabs: General, Columns, Advanced, and Preview.
Figure 1 – SSIS Flat File connection manager
In a Flat File, data is stored as text. When a Flat File holds values of different data types, you can convert them implicitly by setting the column data types in the connection manager or the source component, or explicitly by using the Data Conversion and Derived Column transformations.
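To make the conversion step concrete, the following Python sketch turns the text values of one row into typed values. It only mirrors the idea of an explicit conversion; the column names (`OrderId`, `Amount`, `OrderDate`), the date format, and the `CONVERTERS` mapping are hypothetical and not part of SSIS.

```python
from datetime import date, datetime
from decimal import Decimal
from typing import Any, Callable, Dict

# Hypothetical column-to-type mapping; columns not listed here stay as text.
CONVERTERS: Dict[str, Callable[[str], Any]] = {
    "OrderId": int,
    "Amount": Decimal,
    "OrderDate": lambda s: datetime.strptime(s, "%Y-%m-%d").date(),
}


def convert_row(raw: Dict[str, str]) -> Dict[str, Any]:
    """Convert each text value to its target type, raising on invalid input."""
    typed: Dict[str, Any] = {}
    for column, text in raw.items():
        converter = CONVERTERS.get(column, str)
        typed[column] = converter(text.strip())
    return typed
```

A value that cannot be converted (for example `"abc"` for `OrderId`) raises an exception here; in a real package you would decide whether such rows fail the load or are redirected.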
Figure 2 – SSIS Flat File source description from toolbox
To import or export data from Flat Files, you must use the SSIS Flat File Source and SSIS Flat File Destination components within a Data Flow Task. Note that if you are handling a Flat File that contains non-tabular data, you may need to read the file using a script (task or component) and implement more complex logic, or read each row as a single column (length = 4000) and use a transformation to split the content.
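The "read each row as one column, then transform" approach can be sketched in Python as follows. The file layout is invented for this example: header rows start with `H|` and carry an order number, and detail rows start with `D|` and belong to the most recent header. Neither the layout nor the function names come from SSIS.

```python
from typing import Dict, Iterable, List, Optional


def read_single_column(lines: Iterable[str]) -> List[str]:
    """Treat every line as one wide text column, without splitting it."""
    return [line.rstrip("\r\n") for line in lines]


def split_records(rows: Iterable[str]) -> List[Dict[str, str]]:
    """Attach each detail row ('D|item|qty') to the latest header row ('H|order')."""
    current_order: Optional[str] = None
    details: List[Dict[str, str]] = []
    for row in rows:
        parts = row.split("|")
        if parts[0] == "H":
            current_order = parts[1]
        elif parts[0] == "D" and current_order is not None:
            details.append({"order": current_order, "item": parts[1], "qty": parts[2]})
    return details
```

The first function plays the role of the single-column source, and the second plays the role of the downstream transformation that gives the rows a tabular shape.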
Figure 3 – SSIS Flat File destination description from toolbox
Raw Files are a kind of SSIS Flat File used to dump data between different ETL stages. The data is stored in binary format and can only be used by the SSIS Raw File source and destination components.
To use Raw Files in SSIS, you don’t have to create a connection manager, since the file is defined directly within the Raw File source and Raw File destination components:
Figure 4 – SSIS Raw File destination
To create a Raw File, just add a Raw File destination in a Data Flow Task. The Raw File destination editor has two tabs: Connection Manager and Columns.
Figure 5 – SSIS Raw File destination description from toolbox
In the Connection Manager tab, specify the file name (directly or from a variable), and choose one of the write mode options: Create Always, Create Once, Append, or Truncate and Append.
In the Columns tab, select the columns you want to dump into the SSIS Raw File destination.
Figure 6 – SSIS Raw File source description from toolbox
After dumping data into a raw file, you must use a Raw File Source to read this data. This component is very similar to the destination component, except that there is no Write mode option:
Figure 7 – SSIS Raw File source
Note that the SSIS Raw File source can only be used to read a file created using a Raw File destination.
To read more about Raw Files, refer to the official documentation for the Raw File source and Raw File destination.
Now I will illustrate the differences between the two file types in SSIS.
SSIS Flat Files are widely used to dump data from relational databases for later use, but it is not widely known that they are not the recommended choice from a performance perspective. Even though comma-separated values files (.csv) are among the most popular data sources, Raw Files are designed to deliver higher performance when transferring data.
SSIS Flat Files require parsing and validation, while the data in Raw Files is stored in native format and requires no translation and little parsing. In 2009, John Welch ran an experiment to illustrate the performance difference between SSIS Flat Files and Raw Files.
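The difference between parsing text and reading a native binary layout can be shown with a small Python sketch. This is not the actual SSIS Raw File format, which is internal to SSIS; the two-column row shape and the `RECORD` layout are assumptions chosen only to contrast the two approaches.

```python
import struct
from typing import List, Tuple

Row = Tuple[int, float]

# Assumed layout: 4-byte integer followed by 8-byte double, little-endian.
RECORD = struct.Struct("<id")


def to_text(rows: List[Row]) -> str:
    """Serialize rows as comma-separated text, one row per line."""
    return "".join(f"{i},{x!r}\n" for i, x in rows)


def from_text(text: str) -> List[Row]:
    """Reading text means splitting each line and converting every value."""
    rows: List[Row] = []
    for line in text.splitlines():
        id_text, value_text = line.split(",")
        rows.append((int(id_text), float(value_text)))
    return rows


def to_binary(rows: List[Row]) -> bytes:
    """Serialize rows as fixed-size binary records."""
    return b"".join(RECORD.pack(i, x) for i, x in rows)


def from_binary(blob: bytes) -> List[Row]:
    """Reading binary records needs no delimiter search or string conversion."""
    return list(RECORD.iter_unpack(blob))
```

Both round trips return the same rows, but the text path has to locate delimiters and convert every value from a string, while the binary path reads values already stored in their native types. The price is that the binary output is unreadable without knowing the layout.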
Raw Files are very useful for implementing parallel data import logic, since you can split a file over multiple Raw Files and then import them in parallel.
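The split-then-import-in-parallel pattern can be sketched in Python as follows. This is only an outline of the idea, not an SSIS package: `load_chunk` is a hypothetical stand-in for the step that imports one staged file, and the round-robin split is one of several possible ways to divide the rows.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def split_round_robin(rows: Sequence[str], parts: int) -> List[List[str]]:
    """Distribute rows over `parts` chunks, one chunk per staged file."""
    return [list(rows[i::parts]) for i in range(parts)]


def load_chunk(chunk: List[str]) -> int:
    """Hypothetical import step for one chunk; returns the number of rows loaded."""
    return len(chunk)


def parallel_import(rows: Sequence[str], parts: int = 4) -> int:
    """Split the rows, import every chunk concurrently, and return the total row count."""
    chunks = split_round_robin(rows, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(load_chunk, chunks))
```

In a package, the equivalent design would be one data flow that writes several Raw Files and several data flows that each read one of them at the same time.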
On the other hand, Raw Files cannot be edited or consumed outside of SSIS, which makes them only usable for data staging purposes.
In conclusion, if you need to export data into a file for use in other systems or for publication, you will be best served by a widely used data format such as SSIS Flat Files. But if you need to dump data for use in a different ETL stage, Raw Files are recommended.
Reprinted from: http://pynwd.baihongyu.com/