This functionality was modified in an update. For more information, see Create a new content discovery task (modified in an update).
Before you work with a large amount of data, it is recommended that you configure the LicenseTimeoutSeconds property under the Site Settings node in HxGN SDx Server Manager, choosing a value based on the size of the data. This setting prevents the license token from timing out, so that the session is retained. If you are using SDx deployed in Smart Cloud, this must be done for you by the Smart Cloud Team.
-
On the Content Discovery Task page, click Create Content Discovery Tasks.
-
Select the Document Criteria filter and Document Reader filter to process the documents that match the selected criteria.
-
Select one or more file types for the selected document reader filter.
-
Type search text in the Document Name Pattern box to process the documents that match the selected criteria.
If you want to schedule the content discovery task to run at a later date, click Tasks Start Date.
-
Click OK to view the list of documents that will be processed. You can filter the documents for processing in this window.
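As a rough illustration of how the file type and Document Name Pattern selections narrow the list of documents to process, the following sketch filters a list of file names. It is not the product's implementation: the function `select_documents` is hypothetical, and the source does not say how the name pattern text is matched, so shell-style wildcards are assumed here purely for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath
from typing import Iterable, List, Set


def select_documents(
    names: Iterable[str],
    file_types: Set[str],
    name_pattern: str = "*",
) -> List[str]:
    """Keep names whose extension is in `file_types` and that match `name_pattern`.

    Assumption: matching is case-insensitive and uses shell-style wildcards.
    """
    selected = []
    for name in names:
        suffix = PurePath(name).suffix.lower()
        matches_pattern = fnmatchcase(name.lower(), name_pattern.lower())
        if suffix in file_types and matches_pattern:
            selected.append(name)
    return selected


# Hypothetical file names, used only to show the filtering.
candidates = ["P-100.pdf", "P-101.dwg", "E-200.pdf", "notes.txt"]
print(select_documents(candidates, {".pdf", ".dwg"}, "P-*"))
# ['P-100.pdf', 'P-101.dwg']
```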
-
By default, the property Is Data Capture Rel is set to True on the document-to-tag relationships SPFNDocRevMasterTag, SPFNDocRevAliasTag, FDWDocRevTag, and SPFNFDWDocRevChildTag for Data Capture tags.
-
By default, the tags extracted by the content discovery task are associated with an Unknown tag classification and an Unclassified security code.
-
FDW tags are created without applying the ENS definition.
-
To extract tags from drawing and PDF files, the software applies the templates and rules from the template group that is set as the default. For more information, see Manage drawing reader pre-processor templates and template groups and Manage PDF reader pre-processor templates.
-
After the content discovery task processes documents whose reader is a base reader, the reader is changed to the Image or Document reader. If you have to process these documents with a content discovery task, you must specify the reader as Image or Document without specifying the actual file type.
-
When a content discovery task fails, large file sets are re-processed in progressively smaller batches to locate the problem. For example, documents are re-processed in batches of 100 and then 10, and drawings in batches of 20 and then 2. For each batch, a child content discovery task is created under the master content discovery task.
-
Data Capture creates the relationship object SPFNCDTFailedCDT between a master content discovery task and a child content discovery task.
-
To check the status of a content discovery task for failed documents in the Desktop Client:
-
Click Find > Data Capture Items > Content Discovery Tasks.
-
Right-click a content discovery task, and click Show CDT for Failed Docs in the shortcut menu.
To check the status of a master content discovery task, right-click a content discovery task, and click Show Root CDT in the shortcut menu.
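The batch narrowing described above for failed tasks can be sketched as follows. This is a minimal illustration, not the product's algorithm: `isolate_failures` and the `process_batch` callback are hypothetical names, and the sketch assumes that only batches that failed at one size are split again at the next size. Each call to `process_batch` stands in for one child content discovery task.

```python
from typing import Callable, Iterable, List, Sequence


def chunks(items: Sequence[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def isolate_failures(
    items: Sequence[str],
    process_batch: Callable[[List[str]], bool],
    batch_sizes: Sequence[int] = (100, 10),
) -> List[List[str]]:
    """Return the smallest failing batches found by re-processing `items`.

    `process_batch` is a hypothetical callback that returns True on success.
    """
    failing = [list(items)]  # the master task's full file set has failed
    for size in batch_sizes:
        next_failing = []
        for batch in failing:
            for child in chunks(batch, size):
                if not process_batch(child):
                    next_failing.append(child)
        failing = next_failing
    return failing


# Hypothetical data: 250 documents, one of which cannot be processed.
docs = [f"doc{i}.pdf" for i in range(250)]
bad = {"doc137.pdf"}
smallest = isolate_failures(docs, lambda batch: bad.isdisjoint(batch))
print(smallest[0][0], smallest[0][-1])  # doc130.pdf doc139.pdf
```

For drawings, the same sketch would be called with `batch_sizes=(20, 2)`.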
-
-
You can select a content discovery task and click Rerun Content Discovery Task to rerun the task and process all the documents attached to it.
-
You can select a content discovery task and click Rerun Content Discovery Task for selected documents to process only the selected documents.
-
If the Content File and the GraphicsMap file are available in both the \\PreProcessedAlternateRenditions\PrepProcessedContentFiles folder and the \\PreProcessedContentFiles folder, the content discovery task looks for the Content Files in the \\PreProcessedAlternateRenditions\PrepProcessedContentFiles folder.
-
When content is extracted from drawing files and 3D model documents, a drawing representation object is created for each tag based on the Graphic OID property value in GraphicsMapFile.xml. GraphicsMapFile.xml contains the information needed for graphical navigation, such as the corresponding InterfaceDefs and all tag UIDs and Graphic OIDs for the document. The drawing representation object is related to the respective document and tag.
-
The drawing representation objects are specifically used for graphical navigation in the Web Client.
-
You cannot process transferred documents and FDW documents using a content discovery task.
-
When multiple units with the same name are related to multiple areas, after content extraction the tag is related to the unit based on the tag's relationship to the area.
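The folder precedence noted above, where the alternate renditions folder is preferred when the content files exist in both locations, can be sketched as a lookup over folders in priority order. The function `find_content_file` and the folder and file names in the demonstration are hypothetical stand-ins, not the product's actual paths or implementation.

```python
import tempfile
from pathlib import Path
from typing import Optional, Sequence


def find_content_file(
    file_name: str,
    folders_in_priority_order: Sequence[Path],
) -> Optional[Path]:
    """Return the first existing copy of `file_name`, checking folders in order."""
    for folder in folders_in_priority_order:
        candidate = folder / file_name
        if candidate.is_file():
            return candidate
    return None


# Demonstration with temporary folders standing in for the two locations.
with tempfile.TemporaryDirectory() as root:
    alternate = Path(root) / "alternate"  # stand-in for the alternate renditions folder
    default = Path(root) / "default"      # stand-in for the default content folder
    for folder in (alternate, default):
        folder.mkdir()
        (folder / "P-100.xml").write_text("<content/>")
    found = find_content_file("P-100.xml", [alternate, default])
    print(found.parent.name)  # alternate
```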
Process .sha files
-
In the Central Data Capture Settings module, map the .sha file type to Image Reader in the File Type page.
-
In the Data Capture Pre-Processor Utilities module, process the .sha file using the Drawing Reader Pre-Processor, and generate the content file.
-
In the Data Capture Task Manager module, process the content file with Content Discovery Task.
View document relations
On the Progress tab, click View document relations to view the relationship details of the document in the Relationship Details tab as shown in the following example:
1 | The central node represents the master object on which the view related items service is based.
2 | The Columnset Properties pane displays the properties based on the column set configured for the master object.
3 | The terminal nodes represent the related objects.
4 | The number of related objects. You can click the hyperlink to view the properties of the object, or click View Related Items to view the items related to the object.
5 | The Additional Properties pane displays the properties configured in the Property Lists module.
-
The terminal nodes displayed are based on the EdgeDefs configured on the view related items client API method. The EdgeDefs can be configured as a parameter (Arg1) for the view related items client API method in the Desktop Client.
-
The View Related Items client API method must be related to the interfaces realized by the selected document. If this method is not related to any of the selected document interfaces, then the terminal nodes in the diagram represent objects expanded from the relationship and user-defined edge definitions related to the master object.
-
If a property created in either the Tag Naming System or in the Property Lists module has a relationship configured against it, that relationship is created during the content discovery task.
-
You can click the View error log hyperlink of the content discovery task in the Summary tab to view the error log in the Content Discovery Task module.
-
You can click the View Information Log hyperlink of the content discovery task in the Summary tab to view the log information of files in the Content Discovery Task module.