magicnumber

v0.8.3
Published: Jul 5, 2024 License: GPL-3.0 Imports: 8 Imported by: 0

Documentation

Overview

Package magicnumber contains the magic number matchers for identifying file types that are expected to be handled by the Defacto2 server application. Magic numbers are not always accurate and should be used as hints combined with other checks such as file extension matching.

Usually, the magic number is the first few bytes of a file that uniquely identify the file type. But a number of document formats also check the final few bytes of a file.
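
As a minimal sketch of the leading-bytes idea, the example below compares the start of a byte slice against the well-known 8-byte PNG signature. The helper name hasPNGPrefix is hypothetical and is not part of this package; the package's own matchers may check more than this.

```go
package main

import (
	"bytes"
	"fmt"
)

// pngPrefix is the 8-byte signature that begins every PNG file.
var pngPrefix = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

// hasPNGPrefix is a hypothetical helper (not part of the package API)
// that reports whether p begins with the PNG signature.
func hasPNGPrefix(p []byte) bool {
	return bytes.HasPrefix(p, pngPrefix)
}

func main() {
	// Build a fake file: the signature followed by some arbitrary bytes.
	data := append([]byte{}, pngPrefix...)
	data = append(data, "IHDR"...)

	fmt.Println(hasPNGPrefix(data))
	fmt.Println(hasPNGPrefix([]byte("GIF89a")))
}
```

A prefix check like this is cheap, but it only shows that the file claims to be a PNG, which is why the overview recommends combining it with other checks.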

At a later stage, the magic number matchers will be used to extract metadata from files and support for module tracking music files will be added.

The sources for the magic numbers byte values are from the following:

Functions

func ASCII

func ASCII(p []byte) bool

ASCII returns true if the byte slice exclusively contains printable ASCII characters. Today, ASCII characters are the first characters of the Unicode character set but historically it was a 7 and 8-bit character encoding standard found on most microcomputers, personal computers, and the early Internet.

func Ansi

func Ansi(p []byte) bool

Ansi returns true if the byte slice contains some common ANSI escape codes. For speed and to avoid false positives, it only matches the ANSI escape codes for bold, normal and reset text.
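
The following sketch shows one way such a check could look. The three escape sequences listed are an assumption based on the description (reset, bold and normal intensity); the sequences the package actually matches may differ, and containsSGR is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
)

// containsSGR is a hypothetical helper that reports whether p contains
// one of a few common Select Graphic Rendition escape sequences.
// The exact list is an assumption, not taken from the package source.
func containsSGR(p []byte) bool {
	sequences := [][]byte{
		[]byte("\x1b[0m"),  // reset all attributes
		[]byte("\x1b[1m"),  // bold
		[]byte("\x1b[22m"), // normal intensity
	}
	for _, seq := range sequences {
		if bytes.Contains(p, seq) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsSGR([]byte("\x1b[1mHello\x1b[0m world")))
	fmt.Println(containsSGR([]byte("plain text only")))
}
```

Matching a short, fixed list keeps the scan fast and avoids flagging binary data that merely contains a stray ESC byte.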

func Arc

func Arc(p []byte) bool

Arc matches the FreeArc compression format in the byte slice.

func ArcArk

func ArcArk(p []byte) bool

ArcArk matches the ARChive SEA compression format in the byte slice.

func Arj

func Arj(p []byte) bool

Arj matches the ARJ compression format in the byte slice.

func Avi

func Avi(p []byte) bool

Avi matches the Microsoft Audio Video Interleave video format in the byte slice.

func Avif

func Avif(p []byte) bool

Avif matches the AV1 Image File image format in the byte slice, also known as AVIF. This is a new image format based on the AV1 video codec from the Alliance for Open Media. The detection method is not accurate, so the result should only be used as a hint.

func Bmp

func Bmp(p []byte) bool

Bmp matches the BMP image format in the byte slice.

func Bzip2

func Bzip2(p []byte) bool

Bzip2 matches the Bzip2 Compress archive format in the byte slice.

func Cab

func Cab(p []byte) bool

Cab matches the Microsoft CABinet archive format in the byte slice.

func Daa

func Daa(p []byte) bool

Daa returns true if the byte slice contains the PowerISO DAA CD image signature.

func DosKWAJ

func DosKWAJ(p []byte) bool

DosKWAJ returns true if the byte slice begins with the KWAJ compression signature, found in some DOS executables.

func DosSZDD

func DosSZDD(p []byte) bool

DosSZDD returns true if the byte slice begins with the SZDD compression signature.

func Flac

func Flac(p []byte) bool

Flac matches the Free Lossless Audio Codec audio format in the byte slice.

func Flv

func Flv(p []byte) bool

Flv matches the Shockwave Flash Video format in the byte slice.

func Gif

func Gif(p []byte) bool

Gif matches the Graphics Interchange Format image in the byte slice. There are two versions of the GIF format, GIF87a and GIF89a.
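
Because both versions announce themselves with a six-byte ASCII header, a matcher has to accept either one. A minimal sketch, using the hypothetical name isGIF rather than the package's own implementation:

```go
package main

import (
	"bytes"
	"fmt"
)

// isGIF is a hypothetical helper that accepts either GIF version header.
func isGIF(p []byte) bool {
	return bytes.HasPrefix(p, []byte("GIF87a")) ||
		bytes.HasPrefix(p, []byte("GIF89a"))
}

func main() {
	fmt.Println(isGIF([]byte("GIF89a\x01\x00\x01\x00")))
	fmt.Println(isGIF([]byte("GIF88a")))
}
```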

func Gzip

func Gzip(p []byte) bool

Gzip matches the Gzip Compress archive format in the byte slice.

func Hlp

func Hlp(p []byte) bool

Hlp returns true if the byte slice contains the Windows Help File signature. This is a generic signature for Windows help files and does not differentiate between the various versions of the help file format.

func ISO

func ISO(p []byte) bool

ISO returns true if the byte slice contains the ISO 9660 CD-ROM filesystem signature. For an accurate result, the slice must hold at least the first 36KB of the file.
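
The size requirement follows from where the signature lives: ISO 9660 reserves the first 16 sectors of 2048 bytes, so the "CD001" identifier of the first volume descriptor only appears at byte offset 32769. The sketch below checks that single offset. It is an illustration with a hypothetical helper name; the package presumably inspects further descriptors as well, which would account for the 36KB figure.

```go
package main

import (
	"bytes"
	"fmt"
)

const (
	sectorSize        = 2048 // ISO 9660 logical sector size in bytes
	systemAreaSectors = 16   // reserved sectors before the first volume descriptor
)

// hasISO9660 is a hypothetical helper that looks for the "CD001"
// identifier in the first volume descriptor only.
func hasISO9660(p []byte) bool {
	// The identifier follows the one-byte descriptor type.
	const offset = systemAreaSectors*sectorSize + 1
	id := []byte("CD001")
	if len(p) < offset+len(id) {
		return false // not enough data was read to decide
	}
	return bytes.Equal(p[offset:offset+len(id)], id)
}

func main() {
	img := make([]byte, 36*1024)
	copy(img[32769:], "CD001")

	fmt.Println(hasISO9660(img))
	fmt.Println(hasISO9660(img[:1024])) // too short to contain the identifier
}
```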

func Ico

func Ico(p []byte) bool

Ico matches the Microsoft Icon image format in the byte slice.

func Iff

func Iff(p []byte) bool

Iff matches the Interchange File Format image in the byte slice. This is a generic wrapper format originally created by Electronic Arts for storing data in chunks.

func Ilbm

func Ilbm(p []byte) bool

Ilbm matches the InterLeaved Bitmap image format in the byte slice. Created by Electronic Arts, it conforms to the IFF standard.

func Ivr

func Ivr(p []byte) bool

Ivr matches the RealPlayer video format in the byte slice.

func Jar

func Jar(p []byte) bool

Jar returns true if the byte slice contains the Java ARchive signature.

func Jpeg

func Jpeg(p []byte) bool

Jpeg matches the JPEG File Interchange Format v1 image in the byte slice.

func Jpeg2000

func Jpeg2000(p []byte) bool

Jpeg2000 matches the JPEG 2000 image format in the byte slice.

func JpegNoSuffix

func JpegNoSuffix(p []byte) bool

JpegNoSuffix matches the JPEG File Interchange Format v1 image in the byte slice. This is a less accurate method than Jpeg as it does not check the final bytes.
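
The difference between the two matchers can be sketched with the JPEG start-of-image and end-of-image markers. This is an illustration under stated assumptions: the helper names are hypothetical, and the exact bytes the package compares are not documented here.

```go
package main

import (
	"bytes"
	"fmt"
)

var (
	jpegSOI = []byte{0xff, 0xd8} // start-of-image marker
	jpegEOI = []byte{0xff, 0xd9} // end-of-image marker
)

// prefixOnly is a hypothetical check of the leading bytes only.
func prefixOnly(p []byte) bool {
	return bytes.HasPrefix(p, jpegSOI)
}

// prefixAndSuffix is a hypothetical check of both ends of the data,
// which requires the entire file to be available.
func prefixAndSuffix(p []byte) bool {
	return bytes.HasPrefix(p, jpegSOI) && bytes.HasSuffix(p, jpegEOI)
}

func main() {
	full := []byte{0xff, 0xd8, 0xff, 0xe0, 0x00, 0xff, 0xd9}
	truncated := full[:5] // e.g. only the head of the file was read

	fmt.Println(prefixOnly(truncated), prefixAndSuffix(truncated))
	fmt.Println(prefixOnly(full), prefixAndSuffix(full))
}
```

The prefix-only variant still works on a partial read, at the cost of accepting truncated or corrupted files.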

func LzhLha

func LzhLha(p []byte) bool

LzhLha matches the LHA and LZH compression formats in the byte slice.

func M4v

func M4v(p []byte) bool

M4v matches the QuickTime M4V video format in the byte slice.

func MSComp

func MSComp(p []byte) bool

MSComp returns true if the byte slice contains the Microsoft Compound File signature.

func MSExe

func MSExe(p []byte) bool

MSExe returns true if the byte slice begins with the Microsoft executable signature.

func Mdf

func Mdf(p []byte) bool

Mdf returns true if the byte slice contains the Alcohol 120% MDF CD image signature.

func Midi

func Midi(p []byte) bool

Midi matches the Musical Instrument Digital Interface file format in the byte slice.

func Mp3

func Mp3(p []byte) bool

Mp3 matches the MPEG-1 Audio Layer 3 audio format in the byte slice.

func Mp4

func Mp4(p []byte) bool

Mp4 matches the MPEG-4 video format in the byte slice.

func Mpeg

func Mpeg(p []byte) bool

Mpeg matches the MPEG video format in the byte slice.

func NonISO88951

func NonISO88951(b byte) bool

NonISO88951 returns true if the byte is not a printable ISO/IEC 8859-1 character.

func NonWindows1252

func NonWindows1252(b byte) bool

NonWindows1252 returns true if the byte is not a printable Windows-1252 character.

func NotASCII

func NotASCII(b byte) bool

NotASCII returns true if the byte is not a printable ASCII character. Most control characters are not printable ASCII characters, but exceptions are made for the ESC (escape) character, which is used in ANSI escape codes, and the EOF (end of file) character, which is used in DOS.
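
A rough sketch of such a byte-level test follows. The ESC and EOF exceptions come from the description above; treating tab, line feed and carriage return as acceptable is an assumption, and notPrintableASCII is a hypothetical name rather than the package's implementation.

```go
package main

import "fmt"

// notPrintableASCII is a hypothetical helper that reports whether b falls
// outside the printable ASCII range, with a few control-character exceptions.
func notPrintableASCII(b byte) bool {
	const (
		tab = 0x09 // assumed to be acceptable in text
		lf  = 0x0a // assumed to be acceptable in text
		cr  = 0x0d // assumed to be acceptable in text
		eof = 0x1a // DOS end-of-file marker (Ctrl-Z)
		esc = 0x1b // introduces ANSI escape codes
	)
	switch b {
	case tab, lf, cr, eof, esc:
		return false
	}
	// Printable ASCII runs from space (0x20) to tilde (0x7e).
	return b < 0x20 || b > 0x7e
}

func main() {
	for _, b := range []byte{'A', 0x1b, 0x00, 0x7f, 0x80} {
		fmt.Printf("0x%02x not printable: %v\n", b, notPrintableASCII(b))
	}
}
```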

func NotPlainText

func NotPlainText(b byte) bool

NotPlainText returns true if the byte is not a printable plain text character. Plain text characters include any printable ASCII character as well as any "extended ASCII" character.

func Nri

func Nri(p []byte) bool

Nri returns true if the byte slice contains the Nero CD image signature.

func Ogg

func Ogg(p []byte) bool

Ogg matches the Ogg Vorbis audio format in the byte slice.

func Pcx

func Pcx(p []byte) bool

Pcx matches the Personal Computer eXchange image format in the byte slice.

func Pdf

func Pdf(p []byte) bool

Pdf returns true if the byte slice contains the Portable Document Format signature.

func PdfNoSuffix

func PdfNoSuffix(p []byte) bool

PdfNoSuffix returns true if the byte slice contains the Portable Document Format signature. This is a less accurate method than Pdf as it does not check the final bytes.

func Pklite

func Pklite(p []byte) bool

Pklite matches the PKLITE archive format in the byte slice which is a compressed executable format for DOS and 16-bit Windows.

func Pksfx

func Pksfx(p []byte) bool

Pksfx matches the PKSFX archive format in the byte slice which is a self-extracting archive format.

func Pkzip

func Pkzip(p []byte) bool

Pkzip matches the PKWARE Zip archive format in the byte slice. This is the most common ZIP format and is widely supported. It has been tested against many discontinued and legacy ZIP methods and packagers.

func PkzipMulti

func PkzipMulti(p []byte) bool

PkzipMulti matches the PKWARE Multi-Volume Zip archive format in the byte slice.

func Png

func Png(p []byte) bool

Png matches the Portable Network Graphics image format in the byte slice.

func QTMov

func QTMov(p []byte) bool

QTMov matches the QuickTime Movie video format in the byte slice.

func Rar

func Rar(p []byte) bool

Rar matches the Roshal ARchive format in the byte slice.

func Rarv5

func Rarv5(p []byte) bool

Rarv5 matches the Roshal ARchive v5 format in the byte slice.

func Rtf

func Rtf(p []byte) bool

Rtf returns true if the byte slice contains the Rich Text Format signature.

func RtfNoSuffix

func RtfNoSuffix(p []byte) bool

RtfNoSuffix returns true if the byte slice contains the Rich Text Format signature. This is a less accurate method than Rtf as it does not check the final bytes.

func Tar

func Tar(p []byte) bool

Tar matches the Tape ARchive format in the byte slice.

func Tiff

func Tiff(p []byte) bool

Tiff matches the Tagged Image File Format in the byte slice.

func Txt

func Txt(p []byte) bool

Txt returns true if the byte slice exclusively contains plain text ASCII characters, control characters or "extended ASCII characters".

func TxtLatin1

func TxtLatin1(p []byte) bool

TxtLatin1 returns true if the byte slice exclusively contains plain text ISO/IEC 8859-1 characters, commonly known as the Latin-1 character set.

func TxtWindows

func TxtWindows(p []byte) bool

TxtWindows returns true if the byte slice exclusively contains plain text Windows-1252 characters. This is an extension of the Latin-1 character set with additional typography characters and was long the default character set for English-language versions of Microsoft Windows.

func Utf16

func Utf16(p []byte) bool

Utf16 returns true if the byte slice begins with the UTF-16 Byte Order Mark signature.

func Utf32

func Utf32(p []byte) bool

Utf32 returns true if the byte slice begins with the UTF-32 Byte Order Mark signature.

func Utf8

func Utf8(p []byte) bool

Utf8 returns true if the byte slice begins with the UTF-8 Byte Order Mark signature.
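
The byte order marks themselves are fixed by Unicode, so the three Utf matchers can be illustrated with one short sketch. The function detectBOM and its string results are hypothetical and only serve the example; the package exposes separate boolean matchers instead.

```go
package main

import (
	"bytes"
	"fmt"
)

// detectBOM is a hypothetical helper that names the encoding announced
// by a leading byte order mark, or returns "" when there is none.
func detectBOM(p []byte) string {
	switch {
	case bytes.HasPrefix(p, []byte{0xef, 0xbb, 0xbf}):
		return "UTF-8"
	// UTF-32 is tested before UTF-16 because the little-endian UTF-32
	// mark begins with the same two bytes as the little-endian UTF-16 mark.
	case bytes.HasPrefix(p, []byte{0xff, 0xfe, 0x00, 0x00}),
		bytes.HasPrefix(p, []byte{0x00, 0x00, 0xfe, 0xff}):
		return "UTF-32"
	case bytes.HasPrefix(p, []byte{0xff, 0xfe}),
		bytes.HasPrefix(p, []byte{0xfe, 0xff}):
		return "UTF-16"
	}
	return ""
}

func main() {
	fmt.Println(detectBOM([]byte{0xef, 0xbb, 0xbf, 'h', 'i'}))
	fmt.Println(detectBOM([]byte{0xff, 0xfe, 'h', 0x00}))
	fmt.Println(detectBOM([]byte{0xff, 0xfe, 0x00, 0x00, 'h', 0x00, 0x00, 0x00}))
	fmt.Println(detectBOM([]byte("hello")) == "")
}
```

The ordering matters when the checks are combined: a UTF-16 test run first would also accept little-endian UTF-32 data.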

func Wave

func Wave(p []byte) bool

Wave matches the IBM / Microsoft Waveform audio format in the byte slice.

func Webp

func Webp(p []byte) bool

Webp matches the Google WebP image format in the byte slice.

func Wmv

func Wmv(p []byte) bool

Wmv matches the Microsoft Windows Media video format in the byte slice.

func X7z

func X7z(p []byte) bool

X7z matches the 7z Compress archive format in the byte slice.

func XZ

func XZ(p []byte) bool

XZ matches the XZ Compress archive format in the byte slice.

func ZStd

func ZStd(p []byte) bool

ZStd matches the ZStandard archive format in the byte slice.

func Zip64

func Zip64(p []byte) bool

Zip64 matches the PKWARE Zip64 archive format in the byte slice. This is an extension to the original ZIP format that allows for larger files, but it is not widely supported.

func Zoo

func Zoo(p []byte) bool

Zoo matches the Zoo compression format in the byte slice.

Types

type Extension

type Extension map[Signature][]string

Extension is a map of file type signatures to file extensions.

func Ext

func Ext() Extension

Ext returns a map of file type signatures to common file extensions.

type Finder

type Finder map[Signature]Matcher

Finder is a map of file type signatures to matchers.

func New

func New() Finder

New returns a new Finder with all the matchers.

ANSIEscapeText and PlainText are not included as they need to be checked separately and in a specific order.
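
To show how a signature-to-matcher map can drive detection, here is a small stand-alone sketch. All lowercase names are hypothetical stand-ins for the exported Signature, Matcher and Finder types, and the lookup loop is an assumption about usage, not the package's code.

```go
package main

import (
	"bytes"
	"fmt"
)

type signature int

const (
	unknown signature = iota - 1 // -1, so real signatures start at 0
	gif
	png
)

type matcher func([]byte) bool

type finder map[signature]matcher

// newFinder returns a finder holding two illustrative matchers.
func newFinder() finder {
	return finder{
		gif: func(p []byte) bool {
			return bytes.HasPrefix(p, []byte("GIF87a")) ||
				bytes.HasPrefix(p, []byte("GIF89a"))
		},
		png: func(p []byte) bool {
			return bytes.HasPrefix(p, []byte("\x89PNG\r\n\x1a\n"))
		},
	}
}

// find returns the first signature whose matcher accepts p.
// Go map iteration order is unspecified, so this is only reliable
// when the matchers are mutually exclusive.
func (f finder) find(p []byte) signature {
	for sig, match := range f {
		if match(p) {
			return sig
		}
	}
	return unknown
}

func main() {
	f := newFinder()
	fmt.Println(f.find([]byte("GIF89a...")) == gif)
	fmt.Println(f.find([]byte("no magic here")) == unknown)
}
```

The unordered iteration is one plausible reason for keeping broad text checks out of the map: a matcher that accepts almost any printable input has to run last, which a map cannot guarantee.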

type Matcher

type Matcher func([]byte) bool

Matcher is a function that matches a byte slice to a file type.

type Signature

type Signature int

Signature represents a file type signature.

const (
	Unknown Signature = iota - 1
	ElectronicArtsIFF
	AV1ImageFile
	JPEGFileInterchangeFormat
	JPEG2000
	PortableNetworkGraphics
	GraphicsInterchangeFormat
	GoogleWebP
	TaggedImageFileFormat
	BMPFileFormat
	PersonalComputereXchange
	InterleavedBitmap
	MicrosoftIcon
	MPEG4
	QuickTimeMovie
	QuickTimeM4V
	MicrosoftAudioVideoInterleave
	MicrosoftWindowsMedia
	MPEG
	FlashVideo
	RealPlayer
	MusicalInstrumentDigitalInterface
	MPEG1AudioLayer3
	OggVorbisCodec
	FreeLosslessAudioCodec
	WaveAudioForWindows
	PKWAREZip64
	PKWAREZip
	PKWAREMultiVolume
	PKLITE
	PKSFX
	TapeARchive
	RoshalARchive
	RoshalARchivev5
	GzipCompressArchive
	Bzip2CompressArchive
	X7zCompressArchive
	XZCompressArchive
	ZStandardArchive
	FreeArc
	ARChiveSEA
	YoshiLHA
	ZooArchive
	ArchiveRobertJung
	MicrosoftCABinet
	MicrosoftDOSKWAJ
	MicrosoftDOSSZDD
	MicrosoftExecutable
	MicrosoftCompoundFile
	CDISO9660
	CDNero
	CDPowerISO
	CDAlcohol120
	JavaARchive
	WindowsHelpFile
	PortableDocumentFormat
	RichTextFromat
	UTF8Text
	UTF16Text
	UTF32Text
	ANSIEscapeText
	PlainText
)

func Archive

func Archive(r io.Reader) (Signature, error)

Archive reads all the bytes from the reader and returns the file type signature if the file is a known archive of files or Unknown if the file is not an archive.

func Archives

func Archives() []Signature

Archives returns all the archive file type signatures.

func DiscImage

func DiscImage(r io.Reader) (Signature, error)

DiscImage reads all the bytes from the reader and returns the file type signature if the file is a known CD disk image or Unknown if the file is not a disk image.

func DiscImages

func DiscImages() []Signature

DiscImages returns all the CD disk image file type signatures.

func Document

func Document(r io.Reader) (Signature, error)

Document reads all the bytes from the reader and returns the file type signature if the file is a known document or Unknown if the file is not a document.

func Documents

func Documents() []Signature

Documents returns all the document file type signatures.

func Find

func Find(r io.Reader) (Signature, error)

Find reads all the bytes from the reader and returns the file type signature. Generally, magic numbers are the first few bytes of a file that uniquely identify the file type. But a number of document formats also check the body content or the final few bytes of a file.

func Find512B

func Find512B(r io.Reader) (Signature, error)

Find512B reads the first 512 bytes from the reader and returns the file type signature. This is a less accurate method than Find but should be faster.
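
Limiting a read to the head of a stream is straightforward with the standard library. The sketch below uses io.LimitReader for that purpose; readHead and sliceHead are hypothetical helpers, and whether the package uses the same approach is not stated in this documentation.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readHead is a hypothetical helper that reads at most n bytes from r.
// Shorter inputs are returned whole without an error.
func readHead(r io.Reader, n int64) ([]byte, error) {
	return io.ReadAll(io.LimitReader(r, n))
}

// sliceHead applies readHead to an in-memory byte slice for demonstration.
func sliceHead(p []byte, n int64) ([]byte, error) {
	return readHead(bytes.NewReader(p), n)
}

func main() {
	large := make([]byte, 2000)
	head, err := sliceHead(large, 512)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println(len(head))
}
```

Reading only the head avoids loading large files into memory, but any matcher that inspects the final bytes or a deep offset cannot succeed on it, which is the accuracy trade-off described above.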

func FindBytes

func FindBytes(p []byte) Signature

FindBytes returns the file type signature from the byte slice.

func FindBytes512B

func FindBytes512B(p []byte) Signature

FindBytes512B returns the file type signature and skips the magic number checks that require the entire file to be read.

func Image

func Image(r io.Reader) (Signature, error)

Image reads all the bytes from the reader and returns the file type signature if the file is a known image or Unknown if the file is not an image.

func Images

func Images() []Signature

Images returns all the image file type signatures.

func MatchExt

func MatchExt(filename string, r io.Reader) (bool, Signature, error)

MatchExt determines if the reader matches the file type signature expected from the extension of the filename. It returns true if the file type matches. The signature that was found is always returned, whether or not it matches the extension.

A PNG encoded image using the filename TEST.PNG will return true and the PortableNetworkGraphics signature. A PNG encoded image using the filename TEST.JPG will return false and the PortableNetworkGraphics signature.
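
The behaviour in the two PNG cases can be reproduced with a small stand-alone sketch. Everything here is hypothetical: the lowercase types, the two-entry extension table (including the choice to store extensions with a leading dot) and the use of a byte slice instead of a reader are assumptions made to keep the example short.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
)

type signature int

const (
	unknown signature = iota - 1
	gif
	png
)

// extensions is a hypothetical signature-to-extension table.
var extensions = map[signature][]string{
	gif: {".gif"},
	png: {".png"},
}

// findBytes is a hypothetical two-format detector.
func findBytes(p []byte) signature {
	switch {
	case bytes.HasPrefix(p, []byte("GIF87a")), bytes.HasPrefix(p, []byte("GIF89a")):
		return gif
	case bytes.HasPrefix(p, []byte("\x89PNG\r\n\x1a\n")):
		return png
	}
	return unknown
}

// matchExt reports whether the detected signature is one that the
// filename's extension suggests, and always returns what was detected.
func matchExt(filename string, p []byte) (bool, signature) {
	found := findBytes(p)
	ext := strings.ToLower(filepath.Ext(filename))
	for _, e := range extensions[found] {
		if e == ext {
			return true, found
		}
	}
	return false, found
}

func main() {
	data := []byte("\x89PNG\r\n\x1a\n...")

	ok, sig := matchExt("TEST.PNG", data)
	fmt.Println(ok, sig == png)

	ok, sig = matchExt("TEST.JPG", data)
	fmt.Println(ok, sig == png)
}
```

Returning the detected signature even on a mismatch lets the caller report what the file really is, for example a PNG that was renamed to .JPG.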

func Program

func Program(r io.Reader) (Signature, error)

Program reads all the bytes from the reader and returns the file type signature if the file is a known DOS or Windows program or Unknown if the file is not a program.

func Programs

func Programs() []Signature

Programs returns all the program file type signatures for Microsoft operating systems, DOS and Windows.

func Text

func Text(r io.Reader) (Signature, error)

Text reads the first 512 bytes from the reader and returns the file type signature if the file is a known plain text file or Unknown if the file is not a text file.

func Texts

func Texts() []Signature

Texts returns all the text file type signatures.

func Video

func Video(r io.Reader) (Signature, error)

Video reads all the bytes from the reader and returns the file type signature if the file is a known video or Unknown if the file is not a video.

func Videos

func Videos() []Signature

Videos returns all the video file type signatures.

Directories

Path Synopsis
Package pkzip provides constants and functions for working with PKZip files to determine the compression methods used.
