Documentation ¶
Overview ¶
Package magicnumber contains the magic number matchers for identifying file types that are expected to be handled by the Defacto2 server application. Magic numbers are not always accurate and should be used as hints combined with other checks such as file extension matching.
Usually, the magic number is the first few bytes of a file that uniquely identify the file type. For a number of document formats, however, the matchers also check the final few bytes of the file.
At a later stage, the magic number matchers will be used to extract metadata from files, and support for module tracker music files will be added.
The sources for the magic numbers byte values are from the following:
Index ¶
- func ASCII(p []byte) bool
- func Ansi(p []byte) bool
- func Arc(p []byte) bool
- func ArcArk(p []byte) bool
- func Arj(p []byte) bool
- func Avi(p []byte) bool
- func Avif(p []byte) bool
- func Bmp(p []byte) bool
- func Bzip2(p []byte) bool
- func Cab(p []byte) bool
- func Daa(p []byte) bool
- func DosKWAJ(p []byte) bool
- func DosSZDD(p []byte) bool
- func Flac(p []byte) bool
- func Flv(p []byte) bool
- func Gif(p []byte) bool
- func Gzip(p []byte) bool
- func Hlp(p []byte) bool
- func ISO(p []byte) bool
- func Ico(p []byte) bool
- func Iff(p []byte) bool
- func Ilbm(p []byte) bool
- func Ivr(p []byte) bool
- func Jar(p []byte) bool
- func Jpeg(p []byte) bool
- func Jpeg2000(p []byte) bool
- func JpegNoSuffix(p []byte) bool
- func LzhLha(p []byte) bool
- func M4v(p []byte) bool
- func MSComp(p []byte) bool
- func MSExe(p []byte) bool
- func Mdf(p []byte) bool
- func Midi(p []byte) bool
- func Mp3(p []byte) bool
- func Mp4(p []byte) bool
- func Mpeg(p []byte) bool
- func NonISO88951(b byte) bool
- func NonWindows1252(b byte) bool
- func NotASCII(b byte) bool
- func NotPlainText(b byte) bool
- func Nri(p []byte) bool
- func Ogg(p []byte) bool
- func Pcx(p []byte) bool
- func Pdf(p []byte) bool
- func PdfNoSuffix(p []byte) bool
- func Pklite(p []byte) bool
- func Pksfx(p []byte) bool
- func Pkzip(p []byte) bool
- func PkzipMulti(p []byte) bool
- func Png(p []byte) bool
- func QTMov(p []byte) bool
- func Rar(p []byte) bool
- func Rarv5(p []byte) bool
- func Rtf(p []byte) bool
- func RtfNoSuffix(p []byte) bool
- func Tar(p []byte) bool
- func Tiff(p []byte) bool
- func Txt(p []byte) bool
- func TxtLatin1(p []byte) bool
- func TxtWindows(p []byte) bool
- func Utf16(p []byte) bool
- func Utf32(p []byte) bool
- func Utf8(p []byte) bool
- func Wave(p []byte) bool
- func Webp(p []byte) bool
- func Wmv(p []byte) bool
- func X7z(p []byte) bool
- func XZ(p []byte) bool
- func ZStd(p []byte) bool
- func Zip64(p []byte) bool
- func Zoo(p []byte) bool
- type Extension
- type Finder
- type Matcher
- type Signature
- func Archive(r io.Reader) (Signature, error)
- func Archives() []Signature
- func DiscImage(r io.Reader) (Signature, error)
- func DiscImages() []Signature
- func Document(r io.Reader) (Signature, error)
- func Documents() []Signature
- func Find(r io.Reader) (Signature, error)
- func Find512B(r io.Reader) (Signature, error)
- func FindBytes(p []byte) Signature
- func FindBytes512B(p []byte) Signature
- func Image(r io.Reader) (Signature, error)
- func Images() []Signature
- func MatchExt(filename string, r io.Reader) (bool, Signature, error)
- func Program(r io.Reader) (Signature, error)
- func Programs() []Signature
- func Text(r io.Reader) (Signature, error)
- func Texts() []Signature
- func Video(r io.Reader) (Signature, error)
- func Videos() []Signature
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ASCII ¶
ASCII returns true if the byte slice exclusively contains printable ASCII characters. Today, ASCII characters are the first 128 characters of the Unicode character set, but historically ASCII was a 7 and 8-bit character encoding standard found on most microcomputers, personal computers, and the early Internet.
func Ansi ¶
Ansi returns true if the byte slice contains some common ANSI escape codes. For speed and to avoid false positives, it only matches the ANSI escape codes for bold, normal and reset text.
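The idea of matching only a handful of escape codes can be sketched as follows. This is a minimal illustration, not the package's Ansi implementation: hasANSI is a hypothetical helper, and the three sequences chosen (ESC[0m for reset, ESC[1m for bold, ESC[22m for normal intensity) are an assumption about which codes "bold, normal and reset" refers to.

```go
package main

import (
	"bytes"
	"fmt"
)

// hasANSI is a hypothetical helper, not the package's Ansi function.
// It reports whether p contains one of three SGR escape sequences:
// reset (ESC[0m), bold (ESC[1m) or normal intensity (ESC[22m).
func hasANSI(p []byte) bool {
	sequences := [][]byte{
		[]byte("\x1b[0m"),
		[]byte("\x1b[1m"),
		[]byte("\x1b[22m"),
	}
	for _, seq := range sequences {
		if bytes.Contains(p, seq) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasANSI([]byte("\x1b[1mDefacto2\x1b[0m")))
	fmt.Println(hasANSI([]byte("plain text only")))
}
```

Searching for a short, fixed list of sequences keeps the check cheap and avoids treating any stray ESC byte as proof of ANSI text.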
func Avif ¶
Avif matches the AV1 Image File image format in the byte slice, also known as AVIF. This is a new image format based on the AV1 video codec from the Alliance for Open Media. The detection method is not accurate, so a match should only be used as a hint.
func DosKWAJ ¶
DosKWAJ returns true if the byte slice begins with the KWAJ compression signature, found in some DOS executables.
func Gif ¶
Gif matches the image Graphics Interchange Format in the byte slice. There are two versions of the GIF format, GIF87a and GIF89a.
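A check for the two GIF versions might look like the sketch below. The helper name isGIF is hypothetical and the code is not taken from the package; it only tests the six-byte header that both versions are known to begin with.

```go
package main

import (
	"bytes"
	"fmt"
)

// isGIF is a hypothetical helper, not the package's Gif function.
// It reports whether p begins with either GIF version header.
func isGIF(p []byte) bool {
	return bytes.HasPrefix(p, []byte("GIF87a")) ||
		bytes.HasPrefix(p, []byte("GIF89a"))
}

func main() {
	fmt.Println(isGIF([]byte("GIF89a\x01\x00\x01\x00")))
	fmt.Println(isGIF([]byte("GIF88a")))
}
```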
func Hlp ¶
Hlp returns true if the byte slice contains the Windows Help File signature. This is a generic signature for Windows help files and does not differentiate between the various versions of the help file format.
func ISO ¶
ISO returns true if the byte slice contains the ISO 9660 CD-ROM filesystem signature. To be accurate, it requires at least 36KB of data to be read.
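The 36KB requirement follows from where ISO 9660 stores its identifier. Volume descriptors start at sector 16 of 2048-byte sectors, and each carries the text "CD001" one byte in. The sketch below assumes the three offsets commonly checked (0x8001, 0x8801 and 0x9001); whether the package checks the same offsets is an assumption, and isISO9660 is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
)

// isISO9660 is a hypothetical helper, not the package's ISO function.
// It looks for the "CD001" identifier at three assumed descriptor offsets.
func isISO9660(p []byte) bool {
	id := []byte("CD001")
	offsets := []int{0x8001, 0x8801, 0x9001}
	for _, off := range offsets {
		end := off + len(id)
		if len(p) < end {
			// Offsets are ascending, so later ones cannot fit either.
			return false
		}
		if bytes.Equal(p[off:end], id) {
			return true
		}
	}
	return false
}

func main() {
	img := make([]byte, 0x9001+5)
	copy(img[0x8001:], "CD001")
	fmt.Println(isISO9660(img))
	fmt.Println(isISO9660(make([]byte, 512)))
}
```

The last offset ends at byte 36,870, which is why a 512-byte sample can never confirm this format.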
func Iff ¶
Iff matches the Interchange File Format image in the byte slice. This is a generic wrapper format originally created by Electronic Arts for storing data in chunks.
func Ilbm ¶
Ilbm matches the InterLeaved Bitmap image format in the byte slice. Created by Electronic Arts, it conforms to the IFF standard.
func JpegNoSuffix ¶
JpegNoSuffix matches the JPEG File Interchange Format v1 image in the byte slice. This is a less accurate method than Jpeg as it does not check the final bytes.
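The difference between a prefix-only check and one that also inspects the final bytes can be sketched as below. Both helpers are hypothetical and simplified: they test only the start-of-image marker (FF D8 FF) and the end-of-image marker (FF D9), which is an assumption about what the package's Jpeg and JpegNoSuffix functions compare.

```go
package main

import (
	"bytes"
	"fmt"
)

// jpegPrefix is a hypothetical helper that only checks the leading
// start-of-image marker, so it also accepts truncated files.
func jpegPrefix(p []byte) bool {
	return bytes.HasPrefix(p, []byte{0xff, 0xd8, 0xff})
}

// jpegPrefixSuffix is a hypothetical helper that additionally requires
// the trailing end-of-image marker.
func jpegPrefixSuffix(p []byte) bool {
	return jpegPrefix(p) && bytes.HasSuffix(p, []byte{0xff, 0xd9})
}

func main() {
	truncated := []byte{0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10}
	complete := append(append([]byte{}, truncated...), 0xff, 0xd9)
	fmt.Println(jpegPrefix(truncated), jpegPrefixSuffix(truncated))
	fmt.Println(jpegPrefix(complete), jpegPrefixSuffix(complete))
}
```

A suffix check needs the whole file, so the prefix-only form is the one that works on a partial read.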
func NonISO88951 ¶
NonISO88951 returns true if the byte is not a printable ISO/IEC 8859-1 character.
func NonWindows1252 ¶
NonWindows1252 returns true if the byte is not a printable Windows-1252 character.
func NotASCII ¶
NotASCII returns true if the byte is not a printable ASCII character. Most control characters are not printable ASCII characters, but an exception is made for the ESC (escape) character, which is used in ANSI escape codes, and the EOF (end of file) character, which is used in DOS.
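A byte predicate with those two exceptions could be sketched as follows. The helper notPrintableASCII is hypothetical. Treating DOS EOF as byte 0x1A (Ctrl-Z) reflects the usual DOS convention, while also accepting tab, line feed and carriage return is an assumption that the description above does not confirm.

```go
package main

import "fmt"

const (
	esc = 0x1b // escape, starts ANSI escape codes
	eof = 0x1a // Ctrl-Z, the DOS end of file marker
)

// notPrintableASCII is a hypothetical helper, not the package's NotASCII.
// It reports whether b falls outside printable ASCII, allowing ESC, EOF
// and (as an assumption) common whitespace control characters.
func notPrintableASCII(b byte) bool {
	switch b {
	case esc, eof, '\t', '\n', '\r':
		return false
	}
	return b < 0x20 || b > 0x7e
}

func main() {
	for _, b := range []byte{'A', esc, eof, 0x00, 0x7f, 0x80} {
		fmt.Printf("%#02x %v\n", b, notPrintableASCII(b))
	}
}
```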
func NotPlainText ¶
NotPlainText returns true if the byte is not a printable plain text character. Plain text characters include any printable ASCII character as well as any "extended ASCII" character.
func PdfNoSuffix ¶
PdfNoSuffix returns true if the byte slice contains the Portable Document Format signature. This is a less accurate method than Pdf as it does not check the final bytes.
func Pklite ¶
Pklite matches the PKLITE archive format in the byte slice which is a compressed executable format for DOS and 16-bit Windows.
func Pksfx ¶
Pksfx matches the PKSFX archive format in the byte slice which is a self-extracting archive format.
func Pkzip ¶
Pkzip matches the PKWARE Zip archive format in the byte slice. This is the most common ZIP format and is widely supported. The matcher has been tested against many discontinued and legacy ZIP methods and packagers.
func PkzipMulti ¶
PkzipMulti matches the PKWARE Multi-Volume Zip archive format in the byte slice.
func RtfNoSuffix ¶
RtfNoSuffix returns true if the byte slice contains the Rich Text Format signature. This is a less accurate method than Rtf as it does not check the final bytes.
func Txt ¶
Txt returns true if the byte slice exclusively contains plain text ASCII characters, control characters or "extended ASCII characters".
func TxtLatin1 ¶
TxtLatin1 returns true if the byte slice exclusively contains plain text ISO/IEC 8859-1 characters, commonly known as the Latin-1 character set.
func TxtWindows ¶
TxtWindows returns true if the byte slice exclusively contains plain text Windows-1252 characters. Windows-1252 is an extension of the Latin-1 character set with additional typography characters and was the default character set for English in Microsoft Windows, possibly up to Windows 7.
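The two character sets differ only in the byte range 0x80 to 0x9F. Latin-1 assigns no printable characters there, while Windows-1252 fills most of it with typographic characters and leaves five bytes undefined. The sketch below illustrates that difference with two hypothetical helpers; it is not the package's implementation.

```go
package main

import "fmt"

// highControlRange is a hypothetical helper that reports whether b is in
// 0x80-0x9F, where Latin-1 has no printable characters.
func highControlRange(b byte) bool {
	return b >= 0x80 && b <= 0x9f
}

// undefined1252 is a hypothetical helper that reports the five bytes
// that Windows-1252 leaves unassigned.
func undefined1252(b byte) bool {
	switch b {
	case 0x81, 0x8d, 0x8f, 0x90, 0x9d:
		return true
	}
	return false
}

func main() {
	// 0x93 is a left double quotation mark in Windows-1252.
	const quote = 0x93
	fmt.Println(highControlRange(quote), undefined1252(quote))
	// 0x81 is unprintable in both character sets.
	fmt.Println(highControlRange(0x81), undefined1252(0x81))
}
```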
Types ¶
type Signature ¶
type Signature int
Signature represents a file type signature.
const (
	Unknown Signature = iota - 1
	ElectronicArtsIFF
	AV1ImageFile
	JPEGFileInterchangeFormat
	JPEG2000
	PortableNetworkGraphics
	GraphicsInterchangeFormat
	GoogleWebP
	TaggedImageFileFormat
	BMPFileFormat
	PersonalComputereXchange
	InterleavedBitmap
	MicrosoftIcon
	MPEG4
	QuickTimeMovie
	QuickTimeM4V
	MicrosoftAudioVideoInterleave
	MicrosoftWindowsMedia
	MPEG
	FlashVideo
	RealPlayer
	MusicalInstrumentDigitalInterface
	MPEG1AudioLayer3
	OggVorbisCodec
	FreeLosslessAudioCodec
	WaveAudioForWindows
	PKWAREZip64
	PKWAREZip
	PKWAREMultiVolume
	PKLITE
	PKSFX
	TapeARchive
	RoshalARchive
	RoshalARchivev5
	GzipCompressArchive
	Bzip2CompressArchive
	X7zCompressArchive
	XZCompressArchive
	ZStandardArchive
	FreeArc
	ARChiveSEA
	YoshiLHA
	ZooArchive
	ArchiveRobertJung
	MicrosoftCABinet
	MicrosoftDOSKWAJ
	MicrosoftDOSSZDD
	MicrosoftExecutable
	MicrosoftCompoundFile
	CDISO9660
	CDNero
	CDPowerISO
	CDAlcohol120
	JavaARchive
	WindowsHelpFile
	PortableDocumentFormat
	RichTextFromat
	UTF8Text
	UTF16Text
	UTF32Text
	ANSIEscapeText
	PlainText
)
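The iota - 1 expression on the first constant makes Unknown equal to -1 and numbers the known signatures from 0. The sketch below shows the same pattern on a small, hypothetical type; the names kind, unknown, first and second are illustrative only.

```go
package main

import "fmt"

// kind is a hypothetical stand-in for a signature type.
type kind int

const (
	unknown kind = iota - 1 // iota is 0 here, so unknown is -1
	first                   // the expression repeats with iota 1, giving 0
	second                  // iota 2, giving 1
)

func main() {
	fmt.Println(unknown, first, second)
}
```

Keeping the unknown value negative means the zero value of the type is the first real signature rather than Unknown, so callers should compare against Unknown explicitly instead of relying on the zero value.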
func Archive ¶
Archive reads all the bytes from the reader and returns the file type signature if the file is a known archive of files or Unknown if the file is not an archive.
func DiscImage ¶
DiscImage reads all the bytes from the reader and returns the file type signature if the file is a known CD disc image or Unknown if the file is not a disc image.
func DiscImages ¶
func DiscImages() []Signature
DiscImages returns all the CD disc image file type signatures.
func Document ¶
Document reads all the bytes from the reader and returns the file type signature if the file is a known document or Unknown if the file is not a document.
func Find ¶
Find reads all the bytes from the reader and returns the file type signature. Generally, magic numbers are the first few bytes of a file that uniquely identify the file type. For a number of document formats, however, the body content or the final few bytes of the file are also checked.
func Find512B ¶
Find512B reads the first 512 bytes from the reader and returns the file type signature. This is a less accurate method than Find but should be faster.
func FindBytes512B ¶
FindBytes512B returns the file type signature and skips the magic number checks that require the entire file to be read.
func Image ¶
Image reads all the bytes from the reader and returns the file type signature if the file is a known image or Unknown if the file is not an image.
func MatchExt ¶
MatchExt determines if the reader matches the file type signature expected from the extension of the filename. It returns true if the file type matches. The found signature is always returned, whether or not it matches the extension.
A PNG encoded image using the filename TEST.PNG will return true and the PortableNetworkGraphics signature. A PNG encoded image using the filename TEST.JPG will return false and the PortableNetworkGraphics signature.
func Program ¶
Program reads all the bytes from the reader and returns the file type signature if the file is a known DOS or Windows program or Unknown if the file is not a program.
func Programs ¶
func Programs() []Signature
Programs returns all the program file type signatures for Microsoft operating systems, DOS and Windows.
func Text ¶
Text reads the first 512 bytes from the reader and returns the file type signature if the file is a known plain text file or Unknown if the file is not a text file.