public class POIFSContainerDetector extends Object implements org.apache.tika.detect.Detector
| Modifier and Type | Field | Description |
|---|---|---|
| static org.apache.tika.mime.MediaType | COMP_OBJ | Some other kind of embedded document, in a CompObj container within another OLE2 document |
| static org.apache.tika.mime.MediaType | DGN_8 | |
| static org.apache.tika.mime.MediaType | DOC | Microsoft Word |
| static org.apache.tika.mime.MediaType | DRM_ENCRYPTED | TIKA-3666: MSOffice or other file encrypted with DRM in an OLE container |
| static org.apache.tika.mime.MediaType | ESRI_LAYER | |
| static org.apache.tika.mime.MediaType | GENERAL_EMBEDDED | General embedded document type within an OLE2 container |
| static org.apache.tika.mime.MediaType | MPP | Microsoft Project |
| static org.apache.tika.mime.MediaType | MS_EQUATION | Equation embedded in Office docs |
| static org.apache.tika.mime.MediaType | MS_GRAPH_CHART | Graph/Charts embedded in PowerPoint and Excel |
| static org.apache.tika.mime.MediaType | MSG | Microsoft Outlook |
| static String | OCX_NAME | |
| static org.apache.tika.mime.MediaType | OLE | The OLE base file format |
| static org.apache.tika.mime.MediaType | OLE10_NATIVE | An OLE10 Native embedded document within another OLE2 document |
| static org.apache.tika.mime.MediaType | OOXML_PROTECTED | The protected OOXML base file format |
| static org.apache.tika.mime.MediaType | PPT | Microsoft PowerPoint |
| static org.apache.tika.mime.MediaType | PUB | Microsoft Publisher |
| static org.apache.tika.mime.MediaType | SDA | StarOffice Draw |
| static org.apache.tika.mime.MediaType | SDC | StarOffice Calc |
| static org.apache.tika.mime.MediaType | SDD | StarOffice Impress |
| static org.apache.tika.mime.MediaType | SDW | StarOffice Writer |
| static org.apache.tika.mime.MediaType | SLDWORKS | SolidWorks CAD file |
| static org.apache.tika.mime.MediaType | VSD | Microsoft Visio |
| static org.apache.tika.mime.MediaType | WPS | Microsoft Works |
| static org.apache.tika.mime.MediaType | XLR | Microsoft Works Spreadsheet 7.0 |
| static org.apache.tika.mime.MediaType | XLS | Microsoft Excel |

| Constructor and Description |
|---|
| POIFSContainerDetector() |

| Modifier and Type | Method | Description |
|---|---|---|
| org.apache.tika.mime.MediaType | detect(InputStream input, org.apache.tika.metadata.Metadata metadata) | |
| static org.apache.tika.mime.MediaType | detect(Set<String> names) | Deprecated. Use detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected as the second argument. |
| static org.apache.tika.mime.MediaType | detect(Set<String> anyCaseNames, org.apache.poi.poifs.filesystem.DirectoryEntry root) | Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file. |
| void | setMarkLimit(int markLimit) | If a TikaInputStream is passed in to detect(InputStream, Metadata), and there is not an underlying file, this detector will spool up to markLimit to disk. |

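
The name-based detection described for detect(Set, DirectoryEntry) can be illustrated with a small, self-contained sketch. It uses only the JDK and is not the detector's actual implementation: the class name, the guessType helper, and the three-entry lookup table are hypothetical, and the media types are plain strings standing in for the MediaType constants. What it does show is the idea of comparing top-level entry names without regard to case.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class NameBasedDetectionSketch {
    // Hypothetical, deliberately tiny lookup table (lower-cased entry name -> type).
    // The real detector knows many more formats and applies extra checks.
    private static final Map<String, String> KNOWN = new LinkedHashMap<>();
    static {
        KNOWN.put("worddocument", "application/msword");
        KNOWN.put("workbook", "application/vnd.ms-excel");
        KNOWN.put("powerpoint document", "application/vnd.ms-powerpoint");
    }

    // Stand-in for the generic OLE type returned when nothing more specific matches.
    public static final String FALLBACK = "application/x-tika-msoffice";

    public static String guessType(Set<String> anyCaseNames) {
        if (anyCaseNames == null) {
            return FALLBACK;
        }
        // Normalize every entry name once so that lookups ignore case.
        Set<String> normalized = new HashSet<>();
        for (String name : anyCaseNames) {
            normalized.add(name.toLowerCase(Locale.ROOT));
        }
        // Walk the table in insertion order so the result is deterministic.
        for (Map.Entry<String, String> rule : KNOWN.entrySet()) {
            if (normalized.contains(rule.getKey())) {
                return rule.getValue();
            }
        }
        return FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(guessType(Set.of("WORDDOCUMENT", "1Table")));
        System.out.println(guessType(Set.of("SomethingElse")));
    }
}
```

Normalizing with Locale.ROOT keeps the comparison independent of the default locale of the JVM, which matters for names containing characters such as "I".
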
public static final org.apache.tika.mime.MediaType OLE
public static final org.apache.tika.mime.MediaType OOXML_PROTECTED
public static final org.apache.tika.mime.MediaType DRM_ENCRYPTED
public static final org.apache.tika.mime.MediaType GENERAL_EMBEDDED
public static final org.apache.tika.mime.MediaType OLE10_NATIVE
public static final org.apache.tika.mime.MediaType COMP_OBJ
public static final org.apache.tika.mime.MediaType MS_GRAPH_CHART
public static final org.apache.tika.mime.MediaType MS_EQUATION
public static final String OCX_NAME
public static final org.apache.tika.mime.MediaType XLS
public static final org.apache.tika.mime.MediaType DOC
public static final org.apache.tika.mime.MediaType PPT
public static final org.apache.tika.mime.MediaType PUB
public static final org.apache.tika.mime.MediaType VSD
public static final org.apache.tika.mime.MediaType WPS
public static final org.apache.tika.mime.MediaType XLR
public static final org.apache.tika.mime.MediaType MSG
public static final org.apache.tika.mime.MediaType MPP
public static final org.apache.tika.mime.MediaType SDC
public static final org.apache.tika.mime.MediaType SDA
public static final org.apache.tika.mime.MediaType SDD
public static final org.apache.tika.mime.MediaType SDW
public static final org.apache.tika.mime.MediaType SLDWORKS
public static final org.apache.tika.mime.MediaType ESRI_LAYER
public static final org.apache.tika.mime.MediaType DGN_8

public static org.apache.tika.mime.MediaType detect(Set<String> names)
Deprecated. Use detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected as the second argument.

public static org.apache.tika.mime.MediaType detect(Set<String> anyCaseNames, org.apache.poi.poifs.filesystem.DirectoryEntry root)
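Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file. Detection may also need the root DirectoryEntry of that file for best results; the entry can be given as a second, optional argument.
Following section 2.6.1 of MS-CFB, the detection is performed on case-insensitive entry names.
Parameters:
anyCaseNames - names of the top-level entries, in any case
root - root DirectoryEntry of the file (optional)

public void setMarkLimit(int markLimit)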
If a TikaInputStream is passed in to detect(InputStream, Metadata), and there is not an underlying file, this detector will spool up to markLimit bytes to disk. If the stream was read in its entirety (i.e. the spooled file is not truncated), this detector will open the file with POI and perform detection. If the spooled file is truncated, the detector will return OLE (or MediaType.OCTET_STREAM if there is no OLE header).
As of Tika 1.21, this detector respects the legacy behavior of not performing detection on a non-TikaInputStream.
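
The decision described above (full detection, generic OLE, or octet stream) can be sketched with the JDK alone. This is an illustration under stated assumptions, not the detector's code: MarkLimitSketch, classify, and the Outcome enum are hypothetical names, the data is buffered in memory rather than spooled to disk, and returning OCTET_STREAM for a complete stream that lacks the header is an assumption of the sketch. The eight-byte signature is the standard OLE2 compound file header.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class MarkLimitSketch {
    // Standard OLE2 compound file signature.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    public enum Outcome { FULL_DETECTION, GENERIC_OLE, OCTET_STREAM }

    public static Outcome classify(InputStream in, int markLimit) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            // Ask for one byte more than the limit to learn whether data was cut off.
            in.mark(markLimit + 1);
            byte[] buf;
            try {
                buf = in.readNBytes(markLimit + 1);
            } finally {
                in.reset(); // leave the stream where the caller handed it over
            }
            boolean truncated = buf.length > markLimit;
            if (!startsWithMagic(buf)) {
                return Outcome.OCTET_STREAM; // assumption: applied to complete streams too
            }
            // Only a complete copy can be opened and inspected entry by entry.
            return truncated ? Outcome.GENERIC_OLE : Outcome.FULL_DETECTION;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean startsWithMagic(byte[] buf) {
        if (buf.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (buf[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = new byte[64];
        System.arraycopy(OLE2_MAGIC, 0, data, 0, OLE2_MAGIC.length);
        System.out.println(classify(new ByteArrayInputStream(data), 1024));
        System.out.println(classify(new ByteArrayInputStream(data), 16));
    }
}
```

A larger limit makes full detection possible for larger streams at the cost of more temporary storage, which is the trade-off that setMarkLimit exposes.
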
Parameters:
markLimit - the maximum amount to spool to disk

public org.apache.tika.mime.MediaType detect(InputStream input, org.apache.tika.metadata.Metadata metadata) throws IOException
Specified by:
detect in interface org.apache.tika.detect.Detector
Throws:
IOException

Copyright © 2007–2024 The Apache Software Foundation. All rights reserved.