This functionality was modified in an update. For more information, see Extract data from a document (modified in an update).
You can extract data from a document, which updates all relationships between tags and the document.
To extract content from 3D model .zvf and .mdb2 files, we recommend that you install Microsoft Access database engine 2010 (64-bit) on the SmartPlant Foundation application server.
To extract content from a document, perform the following steps in Web Client:
- Click Documents > All Documents.
- Select a document from the All Documents list, and then click Actions > Extract Content.
- In the Extract Content window, do the following:
  - Select any file attached to the document from the Select File list.
  - Select a template group from the TemplateGroup/Template list to apply the processing rules using the corresponding Preprocessor Reader.
    From Update 14 onwards, for PDF and drawing files, the template group that is set as the default in the Data Capture Drawing Reader Pre-Processor or PDF Reader Pre-Processor is automatically selected and applied when extracting content. However, you can select and apply any other template group from the TemplateGroup/Template list to process the content. For more information, see Manage drawing reader pre-processor templates and template groups and Manage PDF reader pre-processor templates.
- Click OK.
To process the file using preprocessed content files, click Show more and set the additional options as follows:
Click this | To do this
---|---
Use Existing PreProcessed Content Files | Process the file using the available preprocessed content XML file. To extract content using the preprocessed content files, we recommend that you attach ContentFile.xml, along with the corresponding file, to the document. You must also attach GraphicsMapFile.xml if the file type supports graphical navigation.
Reader Pre-Processor | Select the appropriate Preprocessor Reader for processing the datasheet file.
For Hexagon 3D model: OleDB Provider box, and type the connection string | Connect to the Microsoft Access database.
For Hexagon 3D model: Match Tag Patterns check box | Extract the tags based on the tag patterns defined in the Tag Discovery Patterns module.
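
As an illustration of the OleDB Provider option above, the following minimal sketch builds a connection string of the kind accepted by the OLE DB provider that ships with Microsoft Access database engine 2010 (`Microsoft.ACE.OLEDB.12.0`). The helper name `build_access_connection_string` and the file path are hypothetical. Whether your site needs additional keywords in the string is an assumption to confirm with your administrator.

```python
def build_access_connection_string(database_path):
    """Return an OLE DB connection string for a Microsoft Access database.

    Assumption: the ACE 12.0 provider (installed with Microsoft Access
    database engine 2010) is available on the application server.
    """
    if not database_path:
        raise ValueError("database_path must not be empty")
    return f"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={database_path};"


# Hypothetical path to a 3D model database file.
print(build_access_connection_string(r"C:\Models\Plant.mdb2"))
```

The resulting text is what you would type in the OleDB Provider box, adjusted to the actual location of your database file.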
- Before you work with a large amount of data, we recommend that you configure the LicenseTimeoutSeconds property under the Site Settings node in SmartPlant Foundation Server Manager, choosing a value based on the size of the data. This setting prevents the license token from timing out, so that the session is retained.
- To view the status of content extraction from a selected document:
  - On the Actions menu, click Show the detail form > Extract Content.
    For more information about the status of a document processed using Data Capture, see Data Capture Document Status.
- FDW tags are created without applying the ENS definition.
- The master tag and the FDW tag are identified with the same icon. The alias tag is identified with its own icon.
- In the Desktop Client, the FDW tag is identified with the same icon as the master tag extracted using the Data Capture Content Discovery Task module.
- You can select Match Tag Patterns to extract the tags based on the tag patterns defined in the Tag Discovery Patterns module. For a few file types, Match Tag Patterns is selected by default.
- Except for the datasheet file, the Reader Pre-Processor is automatically selected based on the attached file type. For the datasheet file, you can select one of the following options as the base reader:
  - Datasheet Reader
  - PDF Reader
- You can view the base reader set for different file types in the Data Capture Central Settings module in the Desktop Client. For more information, see Manage file types and prioritize them for content extraction.
- For PDF files and Microsoft Office files, the PDF reader is selected by default as the base reader in the Data Capture Central Settings module in the Desktop Client. If the base reader for a file type other than PDF is set to the PDF reader, the PDF reader generates markup renditions when content is extracted from files of that type, and the software uses these renditions to retrieve the tag details. For more information, see Manage file types and prioritize them for content extraction.
- After extracting data from the document, you can navigate to the document and tags in Web Client. For more information, see View and manage Data Capture data using the Web Client.
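
To make the Match Tag Patterns option described above more concrete, here is a minimal sketch of pattern-based tag discovery. The regular expression, the function name `find_tags`, and the sample text are all hypothetical. The actual patterns are the ones defined in the Tag Discovery Patterns module, and the software's own matching logic may differ.

```python
import re

# Hypothetical tag pattern: one or two uppercase letters, a hyphen,
# then three or four digits (for example, P-101 or V-2001).
TAG_PATTERN = re.compile(r"\b[A-Z]{1,2}-\d{3,4}\b")


def find_tags(text):
    """Return the distinct tags found in text, in order of first appearance."""
    found = []
    for match in TAG_PATTERN.finditer(text):
        tag = match.group()
        if tag not in found:
            found.append(tag)
    return found


sample = "Pump P-101 feeds vessel V-2001; P-101 is backed up by P-102."
print(find_tags(sample))  # ['P-101', 'V-2001', 'P-102']
```

Text that does not fit the pattern, such as the word "Pump", is ignored, which is why the quality of the defined patterns determines which tags are extracted.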