Get Dataset
This task uses short polling to get the dataset from Apify. If it receives an empty dataset, it retries with exponential back-off until the dataset becomes available or the timeout is reached. By default, the task times out after 300 seconds to prevent it from hanging. An empty dataset typically means that the actor run has not finished uploading the dataset.
type: "io.kestra.plugin.apify.dataset.Get"
Examples
Get a dataset with a given ID.
id: apify_get_dataset_flow_required_properties
namespace: company.team
tasks:
  - id: list_runs
    type: io.kestra.plugin.apify.dataset.Get
    apiToken: "{{ secret('APIFY_API_TOKEN') }}"
    datasetId: mecGriFjtDHRNtYOZ
Get a dataset with a given ID and specific options.
id: apify_get_dataset_flow
namespace: company.team
tasks:
  - id: list_runs
    type: io.kestra.plugin.apify.dataset.Get
    apiToken: "{{ secret('APIFY_API_TOKEN') }}"
    datasetId: RNtYOZmecGriFjtDH
    clean: false
    offset: 1
    limit: 10
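    # Values starting with # are quoted; otherwise YAML treats the rest of the line as a comment.
    fields: ["userId", "#id", "#createdAt", "postMeta"]
    omit: ["#id"]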
    flatten: postMeta
    sort: ASC
    skipEmpty: false
Properties
apiToken *Required string
Apify API token
API token for Apify. You can find it in your Apify account settings.
datasetId *Required string
datasetId
clean boolean | string
Default: true
Clean
If true, the task returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The default value is true.
fields array
Fields
List of fields to pick from the returned items. Only these fields remain in the resulting record objects.
flatten boolean | string
Default: false
Flatten
List of fields whose nested objects should be transformed into flat structures. For example, with flatten="foo" the object {"foo": {"bar": "hello"}} is turned into {"foo.bar": "hello"}.
limit integer | string
Default: 1000
Limit
Maximum number of items to return. The default value is 1000.
offset integer | string
Default: 0
Offset
Number of items that should be skipped at the start. The default value is 0.
omit array
Omit
List of fields which should be omitted from the returned items.
options Non-dynamic HttpConfiguration
The HTTP client configuration.
simplified boolean | string
Default: false
Simplified
If true then hidden fields are skipped from the output, i.e. fields starting with the # character.
skipEmpty boolean | string
Default: true
SkipEmpty
If true, empty items are skipped from the output. The default value is true.
skipFailedPages boolean | string
Default: false
SkipFailedPages
If true, all items with an errorInfo property are skipped from the output. The default value is false.
skipHidden boolean | string
Default: false
SkipHidden
If true then hidden fields are skipped from the output, i.e. fields starting with the # character.
sort string
Default: ASC
Possible values: ASC, DESC
sort
Sort order of the returned items, either ASC (ascending) or DESC (descending). Defaults to ASC.
unwind array
Unwind
A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as it is. Note that the unwound items ignore the desc parameter.
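As a minimal sketch of the unwind description above, the flow below lists two fields to unwind. The flow id and the field names comments and author are hypothetical and only serve as an illustration; the comments restate the documented behavior rather than showing verified output.

```yaml
id: apify_get_dataset_unwind   # hypothetical flow id
namespace: company.team
tasks:
  - id: get_dataset
    type: io.kestra.plugin.apify.dataset.Get
    apiToken: "{{ secret('APIFY_API_TOKEN') }}"
    datasetId: mecGriFjtDHRNtYOZ
    # Fields are processed in the order listed (hypothetical field names).
    unwind:
      - comments   # assumed to be an array: each element becomes a separate record
      - author     # assumed to be an object: it is merged with the parent object
```

If a listed field is missing from an item, or holds a scalar value, the description above says that the item is kept unchanged.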
view string
View
Defines the view configuration for dataset items based on the schema definition. This parameter determines how the data will be filtered and presented. For complete specification details, see the dataset schema documentation in the Apify documentation.
Outputs
dataset array
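The sketch below shows one way a downstream task might read the dataset output, assuming the standard Kestra expression form outputs.<taskId>.<outputName>. The flow id and task ids are hypothetical, and the core Log task is used only to make the output visible.

```yaml
id: apify_get_dataset_and_log   # hypothetical flow id
namespace: company.team
tasks:
  - id: get_dataset
    type: io.kestra.plugin.apify.dataset.Get
    apiToken: "{{ secret('APIFY_API_TOKEN') }}"
    datasetId: mecGriFjtDHRNtYOZ
    limit: 10   # keep the logged payload small

  # Reads the dataset array produced by the task above.
  - id: log_items
    type: io.kestra.plugin.core.log.Log
    message: "Fetched items: {{ outputs.get_dataset.dataset }}"
```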
Definitions
io.kestra.core.http.client.configurations.TimeoutConfiguration
connectTimeout string
Format: duration
The time allowed to establish a connection to the server before failing.
readIdleTimeout string
Default: PT5M
Format: duration
The time allowed for a read connection to remain idle before closing it.
io.kestra.core.http.client.configurations.BasicAuthConfiguration
type *Required object
password string
The password for HTTP basic authentication.
username string
The username for HTTP basic authentication.
java.nio.charset.Charset
io.kestra.core.http.client.configurations.HttpConfiguration
allowFailed boolean | string
Default: false
If true, allow a failed response code (response code >= 400).
allowedResponseCodes array
List of response codes allowed for this request.
auth BasicAuthConfiguration | BearerAuthConfiguration
The authentication to use.
defaultCharset Charset | string
Default: UTF-8
The default charset for the request.
followRedirects boolean | string
Default: true
Whether redirects should be followed automatically.
logs array
Possible values: REQUEST_HEADERS, REQUEST_BODY, RESPONSE_HEADERS, RESPONSE_BODY
The enabled log.
proxy ProxyConfiguration
The proxy configuration.
ssl SslOptions
The SSL request options.
timeout TimeoutConfiguration
The timeout configuration.
io.kestra.core.http.client.configurations.ProxyConfiguration
address string
The address of the proxy server.
password string
The password for proxy authentication.
port integer | string
The port of the proxy server.
type string
Default: DIRECT
Possible values: DIRECT, HTTP, SOCKS
The type of proxy to use.
username string
The username for proxy authentication.
io.kestra.core.http.client.configurations.SslOptions
insecureTrustAllCertificates boolean | string
Whether to disable checking of the remote SSL certificate.
Only applies if no trust store is configured. Note: This makes the SSL connection insecure and should only be used for testing. If you are using a self-signed certificate, set up a trust store instead.
io.kestra.core.http.client.configurations.BearerAuthConfiguration
type *Required object
token string
The token for bearer token authentication.
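To tie the definitions above together, here is a sketch of an options block that sets timeouts and a proxy. The property names come from HttpConfiguration, TimeoutConfiguration, and ProxyConfiguration as listed above; the flow id, the duration values, and the proxy host and port are placeholder assumptions. Because options is non-dynamic, the values are written as literals rather than expressions.

```yaml
id: apify_get_dataset_http_options   # hypothetical flow id
namespace: company.team
tasks:
  - id: get_dataset
    type: io.kestra.plugin.apify.dataset.Get
    apiToken: "{{ secret('APIFY_API_TOKEN') }}"
    datasetId: mecGriFjtDHRNtYOZ
    options:
      followRedirects: true
      timeout:
        connectTimeout: PT10S     # duration format, same notation as the PT5M default
        readIdleTimeout: PT2M
      proxy:
        type: HTTP                # one of DIRECT, HTTP, SOCKS
        address: proxy.example.com   # placeholder host
        port: 3128                   # placeholder port
```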