hup.io.csv module¶
File I/O for text files containing delimiter-separated values.
The delimiter-separated values format is a family of file formats used for
the storage of tabular data. Its most common variant, the comma-separated
values format, was in use for many years before it was standardized in
RFC 4180, so subtle differences often exist in the
data produced and consumed by different applications. This circumstance has
largely been addressed by PEP 305 and the standard library module csv.
The current module extends the capabilities of the standard library
with I/O handling of file references, support for
non-standard CSV headers, as used in CSV exports of the R programming
language, automatic detection of CSV parameters, and row names.
-
class
File
(file: Union[IO[Any], os.PathLike, hup.io.abc.Connector], header: Optional[Iterable[str]] = None, comment: Optional[str] = None, dialect: Union[str, csv.Dialect, None] = None, delimiter: Optional[str] = None, namecol: Optional[str] = None, hformat: Optional[int] = None)¶ Bases:
hup.base.attrib.Group
File Class for text files containing delimiter-separated values.
Parameters: - file – File reference to a file object. The reference can
either be given as a string or path-like object that points
to a valid entry in the file system, as an instance of the class
Connector, or as an opened file object in reading or writing mode.
- header – Optional list (or arbitrary iterable) of strings that specify the column names within the CSV file. For an existing file, the header is by default extracted from the first content line (not blank and not starting with #). For a new file the header is required, and an error is raised if it is not given.
- comment – Optional string, which precedes the header and the rows of the CSV file, e.g. to include metadata within the file. For an existing file, the string by default is extracted from the initial comment lines (starting with #). For a new file the comment by default is empty.
- dialect – Optional parameter that indicates the CSV dialect in use. The
parameter can be given as a dialect name from the list returned by
the function csv.list_dialects(), or as an instance of the class csv.Dialect.
- delimiter – Single character that is used to separate the column values within the CSV file. For an existing file, the delimiter is by default detected from its appearance within the file. For a new file the default value is ,.
- namecol – Optional name of the column that contains row names. If a valid column is given, the read-only attribute rownames returns this column as a list. By default the column is inferred from the header format in use. For an RFC 4180 compliant header, no row names are used by default. For a header as used in exports of the R programming language, the first column is used by default to store row names.
- hformat – CSV header format. The following formats are supported:
  - 0: RFC 4180: The column header represents the structure of the rows.
  - 1: R programming language: The column header does not include the first column of the rows. This follows from the convention that in the R programming language the CSV export adds an extra column with row names as the first column, which is omitted within the CSV header.
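The difference between the two header formats can be illustrated with the standard library alone. The following is a minimal sketch, not the implementation used by this module: the helper name `split_rownames` and the sample text are hypothetical, and the rule "rows are one value longer than the header" is an assumed way of recognizing the R-style layout.

```python
import csv
import io

# Hypothetical sample in the R-style layout described above: the header
# names two columns, while every data row carries three values because
# the row name comes first.
R_STYLE = "x,y\nrow1,1,2\nrow2,3,4\n"
RFC_STYLE = "x,y\n1,2\n3,4\n"


def split_rownames(text, delimiter=","):
    """Return (header, rownames, rows) for RFC 4180 or R-style text."""
    records = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = records[0], records[1:]
    # Assumption: a row that is one value longer than the header starts
    # with a row name (header format 1); otherwise format 0 is assumed.
    if body and len(body[0]) == len(header) + 1:
        rownames = [row[0] for row in body]
        rows = [tuple(row[1:]) for row in body]
    else:
        rownames = None
        rows = [tuple(row) for row in body]
    return header, rownames, rows


header, rownames, rows = split_rownames(R_STYLE)
```

For the R-style sample, `rownames` holds the first value of every row, while the RFC 4180 sample yields no row names, which mirrors the defaults described for namecol.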
-
close
() → None¶ Close all opened file handlers of CSV File.
-
comment
¶ String containing the initial ‘#’ lines of the CSV file or an empty string, if no initial comment lines could be detected.
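How initial comment lines can be separated from the CSV content is sketched below with plain string handling. The helper `split_comment` is hypothetical, and stripping the leading '#' from each line is an assumption; the module may preserve the comment text differently.

```python
def split_comment(text):
    """Split text into (comment, remainder) after the leading '#' lines."""
    lines = text.splitlines()
    count = 0
    for line in lines:
        if not line.lstrip().startswith("#"):
            break
        count += 1
    # Assumption: the '#' marker and surrounding whitespace are dropped.
    comment = "\n".join(line.lstrip()[1:].strip() for line in lines[:count])
    remainder = "\n".join(lines[count:])
    return comment, remainder


comment, remainder = split_comment("# created by a test\n# unit: cm\na,b\n1,2\n")
```

A text without leading '#' lines yields an empty comment string, in line with the attribute description above.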
-
delimiter
¶ Delimiter character of the CSV file or None, if for an existing file the delimiter could not be detected.
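Delimiter detection of this kind can be approximated with `csv.Sniffer` from the standard library. This is only a sketch under the assumption that a small set of candidate characters is sufficient; the helper `detect_delimiter` and the candidate list are hypothetical, and the module may use a different detection strategy.

```python
import csv


def detect_delimiter(sample, candidates=",;\t|"):
    """Return the detected delimiter, or None if detection fails."""
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # Sniffer raises csv.Error when no consistent delimiter is found.
        return None
    return dialect.delimiter


found = detect_delimiter("a;b;c\n1;2;3\n4;5;6\n")
missing = detect_delimiter("abc\ndef\n")
```

Returning `None` on failure matches the documented behaviour of the attribute for files in which no delimiter can be detected.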
-
dialect
= None¶
-
fields
= None¶
-
header
¶ List of strings containing the column names from the first non-comment, non-empty line of the CSV file.
-
hformat
¶ CSV header format. The following formats are supported:
  - 0: RFC 4180: The column header represents the structure of the rows.
  - 1: R programming language: The column header does not include the first column of the rows. This follows from the convention that in the R programming language the CSV export adds an extra column with row names as the first column, which is omitted within the CSV header.
-
name
= None¶
-
namecol
¶ Read-only name of the column that contains the row names. By default the column is inferred from the header format in use. For an RFC 4180 compliant header, no row names are used by default. For a header as used in exports of the R programming language, the name of the first column is returned by default.
-
open
(mode: str = 'r', columns: Optional[Tuple[str, ...]] = None) → hup.io.csv.HandlerBase¶ Open CSV file in reading or writing mode.
Parameters: - mode – String whose characters specify the mode in which the file is to be opened. The default mode is reading mode, which is indicated by the character r. The character w indicates writing mode. Reading and writing mode are mutually exclusive and cannot be used together.
- columns – Has no effect in writing mode. In reading mode it specifies, by their respective column names, the columns that are returned from the CSV file. By default all columns are returned.
Returns: In reading mode (if mode contains the character r) an instance of the class
Reader
is returned, and in writing mode (if mode contains the character w) an instance of the class Writer
is returned.
-
read
(columns: Optional[Tuple[str, ...]] = None) → List[tuple]¶ Read all rows from current CSV file.
Parameters: columns – Specifies, by their respective column names, the columns that are returned from the CSV file. By default all columns are returned.
Returns: List of tuples, which contain the values of the specified columns.
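Selecting columns by name amounts to translating names into positions within the header. The sketch below shows this with the standard library; the helper `read_columns` is hypothetical and only illustrates the idea, it is not the code behind read().

```python
import csv
import io


def read_columns(text, columns=None):
    """Return rows as tuples, restricted to the named columns."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    names = header if columns is None else columns
    # header.index raises ValueError for a name that is not in the header.
    usecols = [header.index(name) for name in names]
    return [tuple(row[i] for i in usecols) for row in reader]


selected = read_columns("a,b,c\n1,2,3\n4,5,6\n", columns=("c", "a"))
everything = read_columns("a,b,c\n1,2,3\n4,5,6\n")
```

In this sketch the values are returned in the order in which the column names are requested, and all values remain strings because no field types are applied.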
-
rownames
= None¶
-
write
(rows: List[tuple]) → None¶ Write rows to current CSV file.
Parameters: rows – List of tuples (or arbitrary iterables), which respectively contain the values of a single row.
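The layout of a written file (initial comment, header, then rows) can be sketched with `csv.writer`. The helper `dump`, the "# " comment prefix and the "\n" line terminator are assumptions made for this illustration; the module may format its output differently.

```python
import csv
import io


def dump(header, rows, comment="", delimiter=","):
    """Return CSV text consisting of comment lines, header and rows."""
    buffer = io.StringIO()
    # Assumption: every comment line is written with a leading "# ".
    for line in comment.splitlines():
        buffer.write(f"# {line}\n")
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


text = dump(["a", "b"], [(1, 2), (3, 4)], comment="demo")
```

Each row may be any iterable of values, since `csv.writer` converts non-string values with `str()` before writing.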
-
class
HandlerBase
(file: Union[IO[Any], os.PathLike, hup.io.abc.Connector], mode: str = 'r')¶ Bases:
abc.ABC
CSV file I/O Handler Base Class.
Parameters: - file – File reference to a file object. The reference can
either be given as a string or path-like object that points
to a valid entry in the file system, as an instance of the class
Connector, or as an opened file object in reading mode.
- mode – String whose characters specify the mode in which the file is to be opened. The default mode is reading mode, which is indicated by the character r. The character w indicates writing mode. Reading and writing mode are mutually exclusive and cannot be used together.
-
close
() → None¶ Close the CSV file handler.
-
read_row
() → tuple¶ Read a single row from the referenced file as a tuple.
-
read_rows
() → List[tuple]¶ Read multiple rows from the referenced file as a list of tuples.
-
write_row
(row: Iterable[T_co]) → None¶ Write a single row to the referenced file from a tuple.
-
write_rows
(rows: Iterable[tuple]) → None¶ Write multiple rows to the referenced file from a list of tuples.
-
class
Reader
(file: Union[IO[Any], os.PathLike, hup.io.abc.Connector], skiprows: int, usecols: Optional[Tuple[int, ...]], fields: List[Tuple[str, type]], **kwds)¶ Bases:
hup.io.csv.HandlerBase
CSV file I/O Reader Class.
Parameters: - file – File reference to a file object. The reference can
either be given as a string or path-like object that points
to a valid entry in the file system, as an instance of the class
Connector, or as an opened file object in reading mode.
- skiprows – Number of initial lines within the given CSV file that precede the CSV header. By default no lines are skipped.
- usecols – Tuple with column IDs of the columns, which are imported from the given CSV file. By default all columns are imported.
- fields – List (or arbitrary iterable) of field descriptors, respectively given by a tuple, containing a column name and a column type.
- **kwds – Formatting parameters used by csv. See also Dialects and formatting parameters.
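The formatting parameters referred to here are those of the standard library module csv, such as delimiter and quotechar. The example below uses `csv.reader` directly; that Reader forwards **kwds in the same way is an assumption, and the sample text is made up for illustration.

```python
import csv
import io

# Semicolon-separated sample with single quotes as the quote character.
sample = "name;note\n'Ada';'likes; semicolons'\n"

# delimiter and quotechar are standard csv formatting parameters.
parsed = list(csv.reader(io.StringIO(sample), delimiter=";", quotechar="'"))
```

Because the second value is quoted, the semicolon inside it is kept as part of the value instead of being treated as a column separator.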
-
read_row
() → tuple¶ Read a single row from the referenced file as a tuple.
-
read_rows
() → List[tuple]¶ Read multiple rows from the referenced file as a list of tuples.
-
write_row
(row: Iterable[T_co]) → None¶ Write a single row to the referenced file from a tuple.
-
write_rows
(rows: Iterable[tuple]) → None¶ Write multiple rows to the referenced file from a list of tuples.
-
class
Writer
(file: Union[IO[Any], os.PathLike, hup.io.abc.Connector], header: Iterable[str], comment: str = '', **kwds)¶ Bases:
hup.io.csv.HandlerBase
CSV file I/O Writer Class.
Parameters: - file – File reference to a file object. The reference can
either be given as a string or path-like object that points
to a valid entry in the file system, as an instance of the class
Connector, or as an opened file object in writing mode.
- header – List (or arbitrary iterable) of column names that specify the header of the CSV file.
- comment – Initial comment of the CSV file.
- **kwds – Formatting parameters used by csv. See also Dialects and formatting parameters.
-
read_row
() → tuple¶ Read a single row from the referenced file as a tuple.
-
read_rows
() → List[tuple]¶ Read multiple rows from the referenced file as a list of tuples.
-
write_row
(row: Iterable[T_co]) → None¶ Write a single row to the referenced file from a tuple.
-
write_rows
(rows: Iterable[tuple]) → None¶ Write multiple rows to the referenced file from a list of tuples.
-
load
(file: Union[IO[Any], os.PathLike, hup.io.abc.Connector], delimiter: Optional[str] = None, hformat: Optional[int] = None) → hup.io.csv.File¶ Load CSV file.
Parameters: - file – File reference to a file object. The reference can
either be given as a string or path-like object that points
to a valid entry in the file system, as an instance of the class
Connector, or as an opened file object in reading or writing mode.
- delimiter – Single character that is used to separate the column values within the CSV file. By default the delimiter is detected from its appearance within the file.
- hformat –
Returns: Instance of class hup.io.csv.File
-
save
(file: Union[IO[Any], os.PathLike, hup.io.abc.Connector], header: Iterable[str], values: List[tuple], comment: Optional[str] = None, delimiter: Optional[str] = None, hformat: Optional[int] = None) → None¶ Save data to CSV file.
Parameters: - file – File reference to a file object. The reference can
either be given as a string or path-like object that points
to a valid entry in the file system, as an instance of the class
Connector, or as an opened file object in reading or writing mode.
- header – List (or arbitrary iterable) of strings that specify the column names within the CSV file. For an existing file, the header is by default extracted from the first content line (not blank and not starting with #). For a new file the header is required, and an error is raised if it is not given.
- comment – Optional string, which precedes the header and the rows of the CSV file, e.g. to include metadata within the file. For an existing file, the string by default is extracted from the initial comment lines (starting with #). For a new file the comment by default is empty.
- delimiter – Single character that is used to separate the column values within the CSV file. For an existing file, the delimiter is by default detected from its appearance within the file. For a new file the default value is ,.
- hformat –