Usage
Search for a dataset by title in a specific collection
from dtotools.search import search_on_title
results = search_on_title(title="koster", collection="emodnet-biology")
print(results)
This will return:
[<Item id=bdbeb221-7656-52e5-9ade-4b3304db82cd>]
Each result is a pystac Item, which can be explored further with the PySTAC library.
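As a rough illustration of exploring a result, the sketch below collects an item's `id`, its optional `title` property, and the URLs of its assets. It relies only on the `id`, `properties`, and `assets` attributes that pystac Items expose. The `summarize_item` helper is hypothetical and not part of dtotools. To keep the sketch runnable without network access, it uses a stand-in object with made-up property and asset values instead of a real search result.

```python
from types import SimpleNamespace


def summarize_item(item):
    """Collect the id, title, and asset URLs of a STAC item (hypothetical helper)."""
    return {
        "id": item.id,
        # "title" is a common but optional STAC property, so use .get()
        "title": item.properties.get("title"),
        # item.assets maps an asset key to an object with an href attribute
        "assets": {key: asset.href for key, asset in item.assets.items()},
    }


# Stand-in for one element of the list returned by search_on_title.
# The title and asset URL are invented for illustration only.
demo_item = SimpleNamespace(
    id="bdbeb221-7656-52e5-9ade-4b3304db82cd",
    properties={"title": "hypothetical dataset title"},
    assets={"parquet": SimpleNamespace(href="https://example.org/data.parquet")},
)

summary = summarize_item(demo_item)
print(summary)
```

With real results, the same helper could be applied as `[summarize_item(item) for item in results]`.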
Search for a dataset by title in all collections
from dtotools.search import search_on_title
results = search_on_title(title="koster")
Inspect a parquet file
from dtotools.inspect_parquet import inspect_parquet
DATASET_URL = "https://s3.waw3-1.cloudferro.com/emodnet/emodnet_biology/12639/marine_biodiversity_observations_2026-02-26.parquet"
inspect_parquet(DATASET_URL)
inspect_parquet(
dataset=DATASET_URL,
columns=["parameter"],
filters=[("parameter_imisdasid", [4687])],
output_file="output/inspect_parquet_0.csv"
)
This produces the following CSV:
column_name,column_type,unique_values
parameter,string,"[{""value"": ""Detritus (#/l)"", ""count"": 27594}, {""value"": ""Diameter_sample_collector_aperture (cm)"", ""count"": 25644}, {""value"": ""Fibres (#/l)"", ""count"": 27594}, {""value"": ""LifeStage"", ""count"": 27552}, {""value"": ""Mesh_size (um)"", ""count"": 25644}, {""value"": ""Samp_vol (l)"", ""count"": 27540}, {""value"": ""sampling_instrument_name"", ""count"": 26007}, {""value"": ""sampling_platform_name"", ""count"": 27927}, {""value"": ""SubSamplingCoefficient (Dmnless)"", ""count"": 27429}, {""value"": ""unidentified_biota (#/l)"", ""count"": 27594}, {""value"": ""WaterAbund (#/ml)"", ""count"": 27582}]"
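The `unique_values` column in the output above appears to hold a JSON-encoded list of value/count pairs. Assuming that layout, a minimal sketch for loading it with the standard library could look like this. The `load_unique_values` helper is hypothetical, and the sample text is a shortened copy of the output above with only two of the eleven values kept.

```python
import csv
import io
import json

# Shortened copy of the CSV shown above (two of the eleven values kept).
SAMPLE_CSV = '''column_name,column_type,unique_values
parameter,string,"[{""value"": ""Detritus (#/l)"", ""count"": 27594}, {""value"": ""LifeStage"", ""count"": 27552}]"
'''


def load_unique_values(csv_text):
    """Map each column name to a {value: count} dict (hypothetical helper)."""
    summary = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: unique_values is a JSON list of {"value", "count"} objects
        entries = json.loads(row["unique_values"])
        summary[row["column_name"]] = {e["value"]: e["count"] for e in entries}
    return summary


summary = load_unique_values(SAMPLE_CSV)
for column, counts in summary.items():
    # List the values of each column, most frequent first
    for value, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{column}: {value} ({count})")
```

To process the real file, read its contents first, for example with `open("output/inspect_parquet_0.csv", newline="").read()`, and pass the text to the helper.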
Read a parquet file
Read a parquet file without filtering:
DATASET_URL = "https://s3.waw3-1.cloudferro.com/emodnet/emodnet_biology/12639/marine_biodiversity_observations_2026-02-26.parquet"
result = read_parquet(parquet=DATASET_URL, max_rows=10)
Read a parquet file with filtering:
DATASET_URL = "https://s3.waw3-1.cloudferro.com/emodnet/emodnet_biology/12639/marine_biodiversity_observations_2026-02-26.parquet"
result = read_parquet(
parquet=DATASET_URL,
# columns=["datasetid"],
filters={"datasetid": 4687},
max_rows=50
)