Loading data into a Table
A Table can be created from a dataset or a schema, the specifics of which are
discussed in the JavaScript section of the user's
guide. In Python, however, Perspective supports additional data types that are
commonly used when processing data:
pandas.DataFrame
polars.DataFrame
bytes (encoding an Apache Arrow)
objects (either extracting a repr or via reference)
str (encoding as a CSV)
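As a minimal sketch of the str input path, the snippet below builds CSV text with the standard library and passes it to perspective.table(). The helper names rows_to_csv and load_csv_table are hypothetical and not part of Perspective, and the sample rows are made up for illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of dicts that share the same keys into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def load_csv_table(csv_text):
    """Hypothetical helper: hand the CSV text to Perspective as a str."""
    # Imported lazily so rows_to_csv() works without Perspective installed.
    import perspective

    return perspective.table(csv_text)


# Illustrative data only.
rows = [
    {"symbol": "AAPL", "price": 189.5},
    {"symbol": "MSFT", "price": 410.25},
]
csv_text = rows_to_csv(rows)
```

Calling load_csv_table(csv_text) would then create a Table whose column types are inferred from the CSV contents.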
A Table is created in a similar fashion to its JavaScript equivalent:
from datetime import date, datetime
import numpy as np
import pandas as pd
import perspective

data = pd.DataFrame({
    "int": np.arange(100),
    "float": [i * 1.5 for i in range(100)],
    "bool": [True for i in range(100)],
    "date": [date.today() for i in range(100)],
    "datetime": [datetime.now() for i in range(100)],
    "string": [str(i) for i in range(100)]
})

table = perspective.table(data, index="float")
Likewise, a View can be created via the view() method:
view = table.view(group_by=["float"], filter=[["bool", "==", True]])
column_data = view.to_columns()
row_data = view.to_json()
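The two serialization calls differ in orientation. The sketch below assumes that to_columns() yields a dict mapping column names to lists of values and that to_json() yields a list of per-row dicts; it uses plain Python data rather than a real View, and the helper names are hypothetical.

```python
def columns_to_rows(columns):
    """Convert a column-oriented dict of lists into a list of row dicts."""
    names = list(columns)
    return [
        dict(zip(names, values))
        for values in zip(*(columns[name] for name in names))
    ]


def rows_to_columns(rows):
    """Convert a list of row dicts into a column-oriented dict of lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Stand-in for column-oriented output (assumed shape, illustrative values).
column_data = {"int": [0, 1, 2], "string": ["0", "1", "2"]}
row_data = columns_to_rows(column_data)
```

Column-oriented output tends to suit vectorized processing, while row-oriented output is convenient when iterating over records one at a time.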
Polars Support
Polars DataFrame types work similarly to Apache Arrow input, which Perspective
uses to interface with Polars.
import perspective
import polars

df = polars.DataFrame({"a": [1,2,3,4,5]})
table = perspective.table(df)
Pandas Support
Perspective's Table can be constructed from pandas.DataFrame objects.
Internally, this uses pyarrow::from_pandas, which dictates the behavior of
this feature, including type support.
If the dataframe does not have an index set, an integer-typed column named
"index" is created. If you want to preserve the indexing behavior of the
dataframe passed into Perspective, simply create the Table with
index="index" as a keyword argument. This tells Perspective to once again
treat the index as a primary key:
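# Note: set_index() returns a new DataFrame; the result is not assigned here,
# so `data` keeps its default index.
data.set_index("datetime")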
table = perspective.table(data, index="index")
Time Zone Handling
When parsing "datetime" strings, times are assumed to be in local time unless
an explicit time zone offset is parsed. All "datetime" columns (regardless of
input time zone) are output to the user as datetime.datetime objects in
local time according to the Python runtime.
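The rule above can be illustrated with the standard library alone. This is a sketch of the described semantics, not Perspective's own parser: the to_local helper is hypothetical, and it relies on datetime.astimezone(), which treats a naive value as local time and converts an aware value into the local zone.

```python
from datetime import datetime, timezone


def to_local(text):
    """Parse an ISO 8601 string and return it as an aware local datetime."""
    parsed = datetime.fromisoformat(text)
    # Naive input is presumed to be local time; input with an explicit
    # offset is converted from that offset into the local zone.
    return parsed.astimezone()


# No offset: the wall-clock value is kept and interpreted as local time.
naive = to_local("2024-01-15T12:00:00")

# Explicit +02:00 offset: the instant is preserved, shown in local time.
aware = to_local("2024-01-15T12:00:00+02:00")

# The same instant expressed in UTC, independent of the machine's zone.
aware_utc = aware.astimezone(timezone.utc)
```

The wall-clock value of aware depends on the time zone of the machine running the code, which is why only its UTC form is stable across environments.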
This behavior is consistent with Perspective's behavior in JavaScript. For more
details, see this in-depth explanation of perspective-python semantics around
time zone handling.