ZIP/UnZIP
Related pages:
ZIP/UnZIP Technical Notes
Introduction
The Zip/UnZip component allows users to compress and decompress files in their data integration flows. This connector can be used to reduce the size of data files and improve data transfer performance, as well as to simplify the handling of compressed files.
How it works
The component works with ZIP archives and provides two actions: ZIP and UnZIP. Please note the limitations of this component, listed at the end of this page.
Environment variables
This component has two optional environment variables:
- ZIP_TTL - time to live for ZIP files, in milliseconds. Defaults to 60000 milliseconds (60 sec) if no other value is defined. The UnZIP action downloads ZIP files to the filesystem, and the ZIP action launches a scheduler that deletes files older than the allowed TTL (time to live).
- REQUEST_MAX_CONTENT_LENGTH - maximum content length for uploading attachments, in bytes. Defaults to 1GB.
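The defaulting and expiry rules described above can be sketched as follows. This is an illustration only, not the component's code: the function names are hypothetical, and reading "1GB" as 2**30 bytes is an assumption.

```python
import os

DEFAULT_ZIP_TTL_MS = 60_000             # documented default: 60 seconds
DEFAULT_MAX_CONTENT_LENGTH = 1024 ** 3  # assumption: "1GB" read as 2**30 bytes


def load_settings(env=os.environ):
    """Read both optional variables, falling back to the defaults."""
    return {
        "zip_ttl_ms": int(env.get("ZIP_TTL", DEFAULT_ZIP_TTL_MS)),
        "max_content_length": int(
            env.get("REQUEST_MAX_CONTENT_LENGTH", DEFAULT_MAX_CONTENT_LENGTH)
        ),
    }


def is_expired(modified_ms, now_ms, ttl_ms):
    """Return True when a file is older than the allowed time to live."""
    return now_ms - modified_ms > ttl_ms
```

A scheduler along these lines would call `is_expired` for each stored file and delete those for which it returns True.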
Triggers
This component has no trigger functions. This means it cannot be selected as the first component when designing an integration flow.
Actions
ZIP action
Zips the provided files. The action iterates over the files in the body. For each item whose path matches the configured regex, it downloads the file from the provided url and appends it to the ZIP archive, using the provided path as the location and name of the file. The output contains an attachment with the url to the archive.
Configuration fields description
- regex - regex for the filename with extension; only files that match the regex are added to the ZIP. Defaults to match all: '[^]*'.
- httpTimeout - timeout for HTTP requests, in milliseconds. Defaults to 60000 milliseconds (60 sec).
- httpRetry - number of retries for an HTTP request. Defaults to 3.
- zipName - output ZIP filename with extension. By default, a UUID name with the .zip extension is generated.
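To illustrate how httpTimeout and httpRetry might interact, here is a minimal sketch. It assumes that httpRetry counts retries after the first attempt, which the text above does not state. `fetch_with_retry` and the `opener` hook are hypothetical names, not part of the component.

```python
import urllib.request


def _http_get(url, timeout_s):
    """Plain HTTP GET using the standard library."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def fetch_with_retry(url, http_timeout_ms=60_000, http_retry=3, opener=None):
    """Download url, retrying failed requests.

    Assumption: http_retry counts retries after the first attempt, so up to
    http_retry + 1 requests are made. `opener` is a hypothetical hook that
    lets callers replace the real HTTP call.
    """
    opener = opener or _http_get
    last_error = None
    for _attempt in range(max(0, http_retry) + 1):
        try:
            # urllib expects the timeout in seconds, the config is in ms
            return opener(url, http_timeout_ms / 1000)
        except OSError as error:  # URLError and timeouts derive from OSError
            last_error = error
    raise last_error
```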
Input and output schema description
Input schema
Contains an array of items with the following properties:
- url - URL from which the file can be downloaded
- path - path under which the file will be stored inside the ZIP; the regex is validated only against the filename-with-extension part of the path
Output schema
Contains the property size in the body, and the url to the created archive in attachments.
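The ZIP action described in this section can be approximated with Python's standard `zipfile` module. This is a sketch under several assumptions, not the component's implementation: `zip_items` and the injected `fetch` callable are hypothetical, the regex is applied unanchored to the filename part of path, and size is taken to be the archive size in bytes. The JavaScript default '[^]*' is not a valid Python pattern, so the sketch uses the equivalent `[\s\S]*`.

```python
import io
import posixpath
import re
import zipfile


def zip_items(items, fetch, regex=r"[\s\S]*"):
    """Build a ZIP archive in memory from [{'url': ..., 'path': ...}, ...].

    `fetch` is a hypothetical callable mapping a URL to the file's bytes.
    Returns (body, archive_bytes), where body mimics the output schema.
    """
    pattern = re.compile(regex)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for item in items:
            # Only the filename-with-extension part of the path is checked.
            filename = posixpath.basename(item["path"])
            if not pattern.search(filename):  # assumption: unanchored match
                continue
            # The full path is used as location and name inside the archive.
            archive.writestr(item["path"], fetch(item["url"]))
    data = buffer.getvalue()
    return {"size": len(data)}, data  # assumption: size = archive size
```

An input body for this sketch would look like `[{"url": "https://example.com/a.csv", "path": "export/a.csv"}]`, where the URL is a placeholder.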
UnZIP action
Unzips the provided ZIP file. Only files that match regex and have an uncompressed size less than maxFileSize are unzipped.
Configuration fields description
- regex - regex for the filename with extension; only files that match the regex are unzipped. Defaults to match all: '[^]*'.
- maxFileSize - maximum file size, in bytes. Files with an uncompressed size bigger than the provided value are not unzipped. Defaults to 104857600 bytes (100 MB).
- httpTimeout - timeout for HTTP requests, in milliseconds. Defaults to 60000 milliseconds (60 sec).
- httpRetry - number of retries for an HTTP request. Defaults to 3.
Input and output schema description
Input schema
Contains the property url, the URL of the ZIP file that will be downloaded and unzipped.
Output schema
Contains an array of items with the following properties:
- filename - name of the file with extension
- size - uncompressed file size
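The filtering rules of the UnZIP action (regex plus maxFileSize) can be sketched with the standard `zipfile` module. The function name `unzip_matching` is hypothetical. Skipping directory entries and treating a file of exactly maxFileSize as allowed are assumptions, and `[\s\S]*` stands in for the JavaScript default '[^]*', which Python does not accept.

```python
import io
import posixpath
import re
import zipfile


def unzip_matching(zip_bytes, regex=r"[\s\S]*", max_file_size=104_857_600):
    """List the entries of a ZIP archive that pass both filters.

    Returns items shaped like the output schema: filename and
    uncompressed size.
    """
    pattern = re.compile(regex)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():  # assumption: folder entries are ignored
                continue
            filename = posixpath.basename(info.filename)
            if not pattern.search(filename):
                continue
            # file_size is the uncompressed size recorded in the archive.
            if info.file_size > max_file_size:  # boundary is an assumption
                continue
            extracted.append({"filename": info.filename, "size": info.file_size})
    return extracted
```

Checking `file_size` from the archive's directory before extracting means oversized files are rejected without being decompressed.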
Use Case
In this section, we will look at an example of how the Zip/UnZip component can be used. Many CRMs export their data as ZIP archives. Imagine an organization that wants to move to a new CRM system. Different systems use different file formats: the old CRM system supports only CSV files, while the new one supports only XML. The integration flow uses the ZIP/UnZIP component to unzip the archive and then converts the CSV file to XML. The XML file can then be archived into a ZIP and imported into the new CRM. Let’s start with what our flow should look like:
Flow view
In the first step, we start with the Webhook component. It accepts the Zip archive, which we will further process with the Zip/UnZip component.
In the second step, we use the Zip/UnZip component, which gets the URL of the ZIP archive and performs the UnZIP action:
For a better understanding of the next steps, please take a look at the CSV table:
CSV Table
In the third step, we use the CSV component to read the unzipped CSV file:
The output of the CSV component in the JSON format will be sent to the next component:
CSV Sample output
{
  "result": [
    {
      "column0": "Identifier",
      "column1": "First name",
      "column2": "Last name"
    },
    {
      "column0": "901242",
      "column1": "Rachel",
      "column2": "Booker"
    },
    {
      "column0": "207074",
      "column1": "Laura",
      "column2": "Grey"
    },
    {
      "column0": "408129",
      "column1": "Craig",
      "column2": "Johnson"
    },
    {
      "column0": "934600",
      "column1": "Mary",
      "column2": "Jenkins"
    },
    {
      "column0": "507916",
      "column1": "Jamie",
      "column2": "Smith"
    }
  ]
}
In the fourth step, we use an XML component that converts the JSON output from the CSV component into an XML file.
In the last step, we use the ZIP/UnZIP component again, this time to archive the XML output file into a ZIP:
Here you can see what the XML file we archived looks like:
XML output file
<?xml version="1.0" encoding="UTF-8"?>
<table>
  <column0>Identifier</column0>
  <column1>First name</column1>
  <column2>Last name</column2>
  <column0>901242</column0>
  <column1>Rachel</column1>
  <column2>Booker</column2>
  <column0>207074</column0>
  <column1>Laura</column1>
  <column2>Grey</column2>
  <column0>408129</column0>
  <column1>Craig</column1>
  <column2>Johnson</column2>
  <column0>934600</column0>
  <column1>Mary</column1>
  <column2>Jenkins</column2>
  <column0>507916</column0>
  <column1>Jamie</column1>
  <column2>Smith</column2>
</table>
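For readers who want to reproduce the mapping from the CSV sample output to the flat XML structure above, a minimal sketch follows. It is not the XML component's implementation. `rows_to_flat_xml` is a hypothetical helper that assumes every value is a string and relies on Python dictionaries preserving insertion order.

```python
import xml.etree.ElementTree as ET


def rows_to_flat_xml(result):
    """Turn the "result" rows into one <table> with flat column elements."""
    table = ET.Element("table")
    for row in result:
        for key, value in row.items():
            # One element per cell, named after its column key.
            ET.SubElement(table, key).text = value
    # encoding="unicode" returns a str without the XML declaration.
    return ET.tostring(table, encoding="unicode")
```

The returned string is not pretty-printed and carries no XML declaration, so one would need to be prepended before writing the file.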
After all the above steps, this archive is completely ready for use in the new CRM.
Limitations
- The attachments mechanism does not work with the Local Agent Installation.
- The UnZIP action does not support archived folders. It can only unzip archived files.