archive

package module
v1.1.1 Latest
Published: Apr 16, 2025 License: GPL-3.0 Imports: 17 Imported by: 2

README

Defacto2 / archive

The archive package provides compressed and stored archive file extraction and content listing functions. See the reference documentation for additional usage and examples.

Usage

In your Go project, import the archive library.

go get github.com/Defacto2/archive

Use the functions.

package main

import (
    "fmt"

    "github.com/Defacto2/archive"
    "github.com/Defacto2/archive/rezip"
)

func main() {
    // Extract all files from an archive.
    if err := archive.ExtractAll("path/to/archive.zip", "path/to/extract"); err != nil {
        fmt.Println(err)
    }

    // Extract specific files from an archive.
    x := archive.Extractor{
        Source: "path/to/archive.zip",
        Destination: "path/to/extract",
    }
    if err := x.Extract("file1.txt", "file2.txt"); err != nil {
        fmt.Println(err)
    }

    // Extract all files to a temporary directory.
    path, err := archive.ExtractSource("path/to/archive.zip", "tempsubdir")
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Extracted to:", path)

    // List the contents of an archive.
    files, err := archive.List("path/to/archive.zip", "archive.zip")
    if err != nil {
        fmt.Println(err)
    }
    for _, f := range files {
        fmt.Println(f)
    }

    // Search for a possible readme file within the list of files.
    name := archive.Readme("archive.zip", files...)
    fmt.Println(name)

    // Compress a file into a new archive.
    if _, err := rezip.Compress("file1.txt", "path/to/new.zip"); err != nil {
        fmt.Println(err)
    }
    // Compress a directory into a new archive.
    if _, err := rezip.CompressDir("path/to/directory", "path/to/new.zip"); err != nil {
        fmt.Println(err)
    }
}

Documentation

Overview

Package archive provides compressed and stored archive file extraction and content listing.

The file archive formats supported are 7-Zip, ARC, ARJ, CAB, LHA, LZH, RAR, TAR, compressed TAR, and ZIP.

ZIP includes the deflate, implode, shrink, and store methods.

The package uses the following Linux terminal programs for legacy file support.

  1. 7zz - 7-Zip for Linux: console version
  2. arc - PC archive utility
  3. arj - "Open-source ARJ" v3.10
  4. lha - Lhasa v0.4 LHA tool found in the jlha-utils or lhasa packages
  5. hwzip - hwzip for BBS-era ZIP files that use obsolete compression methods
  6. tar - GNU tar
  7. unrar - 6.24 freeware by Alexander Roshal, not the common unrar-free which is feature incomplete
  8. zipinfo - ZipInfo v3 by the Info-ZIP workgroup
  9. gcab - found on Linux in the GNOME msitools package

Constants

const (
	TimeoutExtract = 15 * time.Second // TimeoutExtract is the maximum time allowed for the archive extraction.
	TimeoutDefunct = 5 * time.Second  // TimeoutDefunct is the maximum time allowed for the defunct file extraction.
	TimeoutLookup  = 2 * time.Second  // TimeoutLookup is the maximum time allowed for the program list content.

	// WriteWriteRead is the file mode for read and write access.
	// The file owner and group have read and write access, and others have read access.
	WriteWriteRead fs.FileMode = 0o664
)
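
As a quick check of what the 0o664 mode means, the following minimal sketch prints its symbolic permission string using only the standard library. The `writeWriteRead` constant and the `describe` helper are local stand-ins written for this illustration and are not imported from the package.

```go
package main

import (
	"fmt"
	"io/fs"
)

// writeWriteRead mirrors the documented 0o664 value.
// It is a local stand-in, not the package constant itself.
const writeWriteRead fs.FileMode = 0o664

// describe returns the symbolic permission string for a file mode.
func describe(mode fs.FileMode) string {
	return mode.Perm().String()
}

func main() {
	// Owner and group can read and write, others can only read.
	fmt.Println(describe(writeWriteRead))
}
```

The printed string shows three permission groups in the order owner, group, others.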

Variables

var (
	ErrDest           = errors.New("destination is empty")
	ErrExt            = errors.New("extension is not a supported archive format")
	ErrHLExt          = errors.New("not a valid extension, it must be in the format, .ext")
	ErrNotArchive     = errors.New("file is not an archive")
	ErrNotImplemented = errors.New("archive format is not implemented")
	ErrRead           = errors.New("could not read the file archive")
	ErrProg           = errors.New("program error")
	ErrFile           = errors.New("path is a directory")
	ErrPath           = errors.New("path is a file")
	ErrPanic          = errors.New("extract panic")
	ErrMissing        = errors.New("path does not exist")
	ErrTooMany        = errors.New("will not decompress this archive as it is very large")
)

Functions

func ExtractAll

func ExtractAll(src, dst string) error

ExtractAll extracts all files from the src archive file to the destination directory.

func ExtractSource

func ExtractSource(src, name string) (string, error)

ExtractSource extracts the source file into a temporary directory. The named file is used as part of the extracted directory path. The src is the source file to extract.

func GzipName added in v1.1.0

func GzipName(src string) string

GzipName returns the uncompressed base filename of the gzip archive.

For example, if the base filename is `example.txt.gz`, the uncompressed filename is `example.txt`.
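
The naming rule can be sketched with the standard library alone. The `gzipBaseName` function below is a hypothetical stand-in, not the package's implementation, and it assumes only a lowercase `.gz` suffix is stripped.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// gzipBaseName is a hypothetical stand-in for the documented behaviour.
// It takes the base filename and drops a trailing ".gz" suffix.
// Handling of uppercase suffixes is not covered by this sketch.
func gzipBaseName(src string) string {
	base := filepath.Base(src)
	return strings.TrimSuffix(base, ".gz")
}

func main() {
	fmt.Println(gzipBaseName("/tmp/example.txt.gz"))
}
```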

func HardLink

func HardLink(require, src string) (string, error)

HardLink is used to create a hard link to the source file when the filename does not have the required file extension.

This is a workaround for archive programs such as arj that demand a file extension when the source filename does not have one. The hard link must be removed after use.

Returns:

  • The absolute path of the hardlink is returned if it is created.
  • An empty string is returned if the source file already has the file extension.
  • An error is returned if the source file cannot be linked.
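
The three outcomes listed above could be sketched as follows. `linkWithExt` is a hypothetical helper, and both the link location (next to the source file) and the case-insensitive extension check are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// linkWithExt is a hypothetical sketch of the documented behaviour.
// It returns an empty string when src already has the required
// extension, otherwise it hard links src to src plus the extension.
func linkWithExt(require, src string) (string, error) {
	if strings.EqualFold(filepath.Ext(src), require) {
		return "", nil
	}
	abs, err := filepath.Abs(src)
	if err != nil {
		return "", err
	}
	link := abs + require
	if err := os.Link(abs, link); err != nil {
		return "", err
	}
	return link, nil
}

func main() {
	dir, err := os.MkdirTemp("", "hardlink")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "ARCHIVE")
	if err := os.WriteFile(src, []byte("data"), 0o664); err != nil {
		fmt.Println(err)
		return
	}
	link, err := linkWithExt(".arj", src)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.Remove(link) // the caller removes the link after use
	fmt.Println(filepath.Base(link))
}
```

The deferred os.Remove mirrors the requirement that the caller cleans up the link once the archive program has finished.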

func List

func List(src, filename string) ([]string, error)

List returns the files within a 7zip, arc, arj, lha/lzh, gzip, rar, tar, or zip archive. The filename extension is used to determine the archive format.

func MagicExt

func MagicExt(src string) (string, error)

MagicExt uses the Linux file program to determine the src archive file type. The returned string will be a file separator and extension.

Note that bzip2 and gzip archives no longer return the .tar extension prefix. The detection of tar.gz archives requires the src filename to end with .tar.gz, otherwise the file will be treated as a gzip archive.
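
The filename rule for tarballs can be illustrated with a small sketch. `gzipExt` is a hypothetical helper that only inspects the name, and the case-insensitive comparison is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// gzipExt is a hypothetical helper for a file already known to be gzip.
// It returns ".tar.gz" only when the name ends with that suffix,
// otherwise the file is treated as a plain gzip archive.
func gzipExt(name string) string {
	if strings.HasSuffix(strings.ToLower(name), ".tar.gz") {
		return ".tar.gz"
	}
	return ".gz"
}

func main() {
	fmt.Println(gzipExt("backup.tar.gz"))
	fmt.Println(gzipExt("notes.txt.gz"))
}
```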

func Readme

func Readme(filename string, files ...string) string

Readme returns the best matching scene text README or NFO file from a collection of files. The filename is the name of the archive file, and the files are the list of files in the archive. Note the filename matches are case-insensitive as many handled file archives are created on Windows FAT32, NTFS or MS-DOS FAT16 file systems.

Example
package main

import (
	"fmt"

	"github.com/Defacto2/archive"
)

func main() {
	name := archive.Readme("APP.ZIP", "APP.EXE", "APP.TXT",
		"APP.BIN", "APP.DAT", "STUFF.DAT")
	fmt.Println(name)
}
Output:

APP.TXT

Types

type Content

type Content struct {
	Ext   string   // Ext returns file extension of the archive.
	Files []string // Files returns list of files within the archive.
}

Content is the result of using system programs to read the file archives.

func ListARJ() {
    var c archive.Content
    err := c.ARJ("archive.arj")
    if err != nil {
        fmt.Fprintf(os.Stderr, "error: %v\n", err)
        return
    }
    for name := range slices.Values(c.Files) {
        fmt.Println(name)
    }
}

func (*Content) ARC added in v1.1.0

func (c *Content) ARC(src string) error

ARC returns the content of the src ARC archive. The format was once credited to System Enhancement Associates, but is now handled using the arc program by Howard Chu.

func (*Content) ARJ

func (c *Content) ARJ(src string) error

ARJ returns the content of the src ARJ archive. The format is credited to Robert Jung and is handled using the arj program.

func (*Content) Cab added in v1.1.0

func (c *Content) Cab(src string) error

Cab returns the content of the src Cabinet archive. The format is credited to Microsoft. On Linux the format is handled with the gcab program by Marc-André Lureau.

func (*Content) Gzip added in v1.1.0

func (c *Content) Gzip(src string) error

Gzip returns the uncompressed filename of the gzip archive which is expected to be a single file.

func (*Content) LHA

func (c *Content) LHA(src string) error

LHA returns the content of the src LHA or LZH archive. The format is credited to Haruyasu Yoshizaki (Yoshi) and is handled using the lha program.

On Linux either the jlha-utils or lhasa work.

func (*Content) Rar

func (c *Content) Rar(src string) error

Rar returns the content of the src RAR archive. The format is credited to Alexander Roshal using the unrar program.

On Linux there are two versions of the unrar program, the freeware version by Alexander Roshal and the feature incomplete unrar-free. The freeware version is the recommended program for extracting RAR archives.

func (*Content) Read

func (c *Content) Read(src string) error

Read returns the content of the src file archive using the system archiver programs. The filename is used to determine the archive format.

Supported formats are: 7-zip, arc, arj, Gzip, lha, lzh, rar, tar, zip.
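
Choosing a format from the filename can be sketched as a switch on the lowercased extension. `formatOf` and its extension-to-format table are assumptions made for illustration; the package's actual lookup may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// formatOf is a hypothetical helper that maps a filename extension
// to one of the documented format names, or "" when unsupported.
func formatOf(filename string) string {
	switch strings.ToLower(filepath.Ext(filename)) {
	case ".7z":
		return "7-zip"
	case ".arc":
		return "arc"
	case ".arj":
		return "arj"
	case ".gz":
		return "gzip"
	case ".lha", ".lzh":
		return "lha"
	case ".rar":
		return "rar"
	case ".tar":
		return "tar"
	case ".zip":
		return "zip"
	}
	return ""
}

func main() {
	for _, name := range []string{"DEMO.LZH", "intro.zip", "notes.txt"} {
		fmt.Printf("%s -> %q\n", name, formatOf(name))
	}
}
```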

func (*Content) Tar added in v1.1.0

func (c *Content) Tar(src string) error

Tar returns the content of the Tar archive using the bsdtar program.

func (*Content) Zip

func (c *Content) Zip(src string) error

Zip returns the content of the src ZIP archive. The format is credited to Phil Katz using the zipinfo program.

func (*Content) Zip7 added in v1.1.0

func (c *Content) Zip7(src string) error

Zip7 returns the content of the src 7-zip archive. The format is credited to Igor Pavlov and is handled using the 7z program.

On some Linux distributions the 7z program is named 7zz. The legacy version of the 7z program in the p7zip package should not be used!

type Extractor

type Extractor struct {
	Source      string // The source archive file.
	Destination string // The extraction destination directory.
}

Extractor uses system archiver programs to extract the targets from the src file archive.

func Extract() {
    x := archive.Extractor{
        Source:      "archive.arj",
        Destination: os.TempDir(),
    }
    err := x.Extract("README.TXT", "INFO.DOC")
    if err != nil {
        fmt.Fprintf(os.Stderr, "error: %v\n", err)
        return
    }
}

func (Extractor) ARC

func (x Extractor) ARC(targets ...string) error

ARC extracts the content of the ARC archive. The format was once credited to System Enhancement Associates, but is now handled using the arc program by Howard Chu. If the targets are empty then all files are extracted.

func (Extractor) ARJ

func (x Extractor) ARJ(targets ...string) error

ARJ extracts the targets from the source ARJ archive to the destination directory using the arj program. If the targets are empty then all files are extracted.

func (Extractor) Cab added in v1.1.0

func (x Extractor) Cab() error

Cab decompresses the source archive file to the destination directory. The format is credited to Microsoft. On Linux the format is handled with the gcab program by Marc-André Lureau which does not support targets for extraction.

func (Extractor) Extract

func (x Extractor) Extract(targets ...string) error

Extract extracts the targets from the source file archive to the destination directory using a system archive program. If the targets are empty then all files are extracted.

The required Filename string is used to determine the archive format.

The following archive formats do not support targets and will always extract the whole archive.

  • Gzip

Some archive formats that could be implemented in the future if needed include "freearc" and "zoo".

func (Extractor) Generic added in v1.1.0

func (x Extractor) Generic(run Run, targets ...string) error

Generic extracts the targets from the source archive to the destination directory using the specified archive program. If the targets are empty then all files are extracted.

It is used for archive formats that are not widely supported or have a limited feature set including ARC, HWZIP, and others.

These DOS-era archive formats are not widely supported. They also do not support extracting to a target directory. To work around this, Generic copies the source archive to the destination directory, uses that as the working directory and extracts the files. The copied source archive is then removed.

func (Extractor) Gzip added in v1.0.6

func (x Extractor) Gzip(targets ...string) error

Gzip decompresses the source archive file to the destination directory. The source file is expected to be a gzip compressed file. Unlike the other container formats, gzip only compresses a single file.

The targets are only used for the tarball gzip (.tar.gz) archive format, otherwise they are ignored.
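
To illustrate that gzip wraps a single stream rather than a set of files, here is a minimal in-memory round trip. It uses Go's compress/gzip package instead of the system programs this package calls, and `gzipBytes` and `gunzip` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipBytes compresses data into a single gzip stream.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzip decompresses a single gzip stream held in memory.
func gunzip(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	packed, err := gzipBytes([]byte("hello"))
	if err != nil {
		fmt.Println(err)
		return
	}
	plain, err := gunzip(packed)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(plain))
}
```

Because there is only one stream, there is nothing for a list of targets to select, which is why targets only matter for a .tar.gz tarball.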

func (Extractor) LHA

func (x Extractor) LHA(targets ...string) error

LHA extracts the targets from the source LHA/LZH archive. The format is credited to Haruyasu Yoshizaki (Yoshi) and is handled using the lha program. If the targets are empty then all files are extracted.

On Linux either the jlha-utils or lhasa work.

func (Extractor) Rar

func (x Extractor) Rar(targets ...string) error

Rar extracts the targets from the source RAR archive to the destination directory using the unrar program. If the targets are empty then all files are extracted.

On Linux there are two versions of the unrar program, the freeware version by Alexander Roshal and the feature incomplete unrar-free. The freeware version is the recommended program for extracting RAR archives.

func (Extractor) Tar added in v1.1.0

func (x Extractor) Tar(targets ...string) error

Tar extracts the content of the Tar archive using the bsdtar program. If the targets are empty then all files are extracted.

bsdtar uses the performant libarchive library for archive extraction:

gzip, bzip2, compress, xz, lzip, lzma, tar, iso9660, zip, ar, xar, lha/lzh, rar, rar v5, Microsoft Cabinet, 7-zip.

func (Extractor) TempTar added in v1.1.0

func (x Extractor) TempTar(targets ...string) error

TempTar functions like Tar but removes the source tarball after extraction.

func (Extractor) Zip

func (x Extractor) Zip(targets ...string) error

Zip extracts the content of the src ZIP archive. The format is credited to Phil Katz using the unzip program. If the targets are empty then all files are extracted.

func (Extractor) Zip7

func (x Extractor) Zip7(targets ...string) error

Zip7 extracts the targets from the source 7z archive to the destination directory using the 7z program. If the targets are empty then all files are extracted.

On some Linux distributions the 7z program is named 7zz. The legacy version of the 7z program in the p7zip package should not be used!

func (Extractor) ZipHW

func (x Extractor) ZipHW() error

ZipHW extracts the content of the src ZIP archive using the hwzip program. The format is credited to Phil Katz.

Modern unzip only supports the Deflate and Store compression methods.

hwzip supports these legacy PKZIP formats that are not supported anymore:

  • Shrink
  • Reduce
  • Implode

hwzip does not support targets, that is, extracting individual files from a zip archive.
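
The legacy methods can be recognised by the compression method ID stored in each ZIP entry. In the ZIP file format, 0 is Store, 1 is Shrink, 2 to 5 are Reduce, 6 is Implode, and 8 is Deflate. The sketch below is a standalone illustration with a hypothetical `legacyMethod` helper and does not use the pkzip subpackage.

```go
package main

import (
	"archive/zip"
	"fmt"
)

// legacyMethod reports whether a ZIP compression method ID is one of
// the obsolete PKZIP methods: 1 Shrink, 2 to 5 Reduce, or 6 Implode.
// It is a hypothetical helper written for this illustration.
func legacyMethod(method uint16) bool {
	return method >= 1 && method <= 6
}

func main() {
	// zip.Store (0) and zip.Deflate (8) are the methods Go supports.
	for _, m := range []uint16{zip.Store, 1, 6, zip.Deflate} {
		fmt.Println(m, legacyMethod(m))
	}
}
```

When reading a real archive with archive/zip, the ID is available as the Method field of each file header.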

func (Extractor) Zips added in v1.1.0

func (x Extractor) Zips(targets ...string) error

Zips attempts to delegate the extraction of the source archive to the correct zip decompression program for the file archive.

Some filenames set by MS-DOS are not valid filenames on modern systems due to the use of codepoints that are not valid in Unicode.

If the ZIP file uses a passphrase an error is returned.
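
Both conditions can be checked from ZIP header data. In the ZIP format, bit 0 of the general purpose flags marks an encrypted entry, and a filename can be tested for valid UTF-8. The sketch below shows these two checks with hypothetical helpers `encrypted` and `validName`; how the package itself detects them is not stated here.

```go
package main

import (
	"archive/zip"
	"fmt"
	"unicode/utf8"
)

// encrypted reports whether bit 0 of the general purpose flags is set,
// which marks a passphrase protected entry in the ZIP format.
func encrypted(flags uint16) bool {
	return flags&0x1 != 0
}

// validName reports whether a stored filename is valid UTF-8.
// Names written with an MS-DOS code page often are not.
func validName(name string) bool {
	return utf8.ValidString(name)
}

func main() {
	// A hand-built header used only for illustration.
	fh := zip.FileHeader{Name: "SECRET.TXT", Flags: 0x1}
	fmt.Println(encrypted(fh.Flags))
	fmt.Println(validName("README.TXT"), validName("DEMO\x8e.TXT"))
}
```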

type Finds

type Finds map[string]Usability

Finds are a collection of matched filenames and their usability ranking.

func (Finds) BestMatch

func (f Finds) BestMatch() string

BestMatch returns the most usable filename from a collection of finds.

type Run added in v1.1.0

type Run struct {
	Program string // Program is the archiver program to run, but not the full path.
	Extract string // Extract is the program command to extract files from the archive.
}

Run is a struct that holds the program and extract command for use with the generic extractor.

type Usability

type Usability uint

Usability is the ranking of search and filename pattern matches.

const (
	// Lvl1 is the highest usability.
	Lvl1 Usability = iota + 1
	Lvl2
	Lvl3
	Lvl4
	Lvl5
	Lvl6
	Lvl7
	Lvl8
	Lvl9 // Lvl9 is the least usable.
)
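
Since Lvl1 is the most usable, picking a best match amounts to finding the entry with the lowest level. The sketch below uses local `usability` and `finds` types as stand-ins for the package types, and breaking ties alphabetically is an assumption; the filenames and levels are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// usability and finds are local stand-ins for the package types.
type usability uint

type finds map[string]usability

// bestMatch returns the filename with the lowest level.
// Ties are broken alphabetically, which is an assumption of this sketch.
// An empty collection returns an empty string.
func (f finds) bestMatch() string {
	names := make([]string, 0, len(f))
	for name := range f {
		names = append(names, name)
	}
	sort.Strings(names)
	best := ""
	var level usability
	for i, name := range names {
		if i == 0 || f[name] < level {
			best, level = name, f[name]
		}
	}
	return best
}

func main() {
	f := finds{"b.txt": 2, "a.nfo": 1, "c.diz": 5}
	fmt.Println(f.bestMatch())
}
```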

Directories

Path	Synopsis
pkzip	Package pkzip provides constants and functions for working with PKZip files to determine the compression methods used.
rezip	Package rezip provides compression for files and directories to create zip archives using the universal Store and Deflate compression methods.