Glue
This page documents functions available when using the Glue module, created with @service Glue.
Index
Main.Glue.batch_create_partition
Main.Glue.batch_delete_connection
Main.Glue.batch_delete_partition
Main.Glue.batch_delete_table
Main.Glue.batch_delete_table_version
Main.Glue.batch_get_blueprints
Main.Glue.batch_get_crawlers
Main.Glue.batch_get_custom_entity_types
Main.Glue.batch_get_data_quality_result
Main.Glue.batch_get_dev_endpoints
Main.Glue.batch_get_jobs
Main.Glue.batch_get_partition
Main.Glue.batch_get_table_optimizer
Main.Glue.batch_get_triggers
Main.Glue.batch_get_workflows
Main.Glue.batch_stop_job_run
Main.Glue.batch_update_partition
Main.Glue.cancel_data_quality_rule_recommendation_run
Main.Glue.cancel_data_quality_ruleset_evaluation_run
Main.Glue.cancel_mltask_run
Main.Glue.cancel_statement
Main.Glue.check_schema_version_validity
Main.Glue.create_blueprint
Main.Glue.create_classifier
Main.Glue.create_connection
Main.Glue.create_crawler
Main.Glue.create_custom_entity_type
Main.Glue.create_data_quality_ruleset
Main.Glue.create_database
Main.Glue.create_dev_endpoint
Main.Glue.create_job
Main.Glue.create_mltransform
Main.Glue.create_partition
Main.Glue.create_partition_index
Main.Glue.create_registry
Main.Glue.create_schema
Main.Glue.create_script
Main.Glue.create_security_configuration
Main.Glue.create_session
Main.Glue.create_table
Main.Glue.create_table_optimizer
Main.Glue.create_trigger
Main.Glue.create_usage_profile
Main.Glue.create_user_defined_function
Main.Glue.create_workflow
Main.Glue.delete_blueprint
Main.Glue.delete_classifier
Main.Glue.delete_column_statistics_for_partition
Main.Glue.delete_column_statistics_for_table
Main.Glue.delete_connection
Main.Glue.delete_crawler
Main.Glue.delete_custom_entity_type
Main.Glue.delete_data_quality_ruleset
Main.Glue.delete_database
Main.Glue.delete_dev_endpoint
Main.Glue.delete_job
Main.Glue.delete_mltransform
Main.Glue.delete_partition
Main.Glue.delete_partition_index
Main.Glue.delete_registry
Main.Glue.delete_resource_policy
Main.Glue.delete_schema
Main.Glue.delete_schema_versions
Main.Glue.delete_security_configuration
Main.Glue.delete_session
Main.Glue.delete_table
Main.Glue.delete_table_optimizer
Main.Glue.delete_table_version
Main.Glue.delete_trigger
Main.Glue.delete_usage_profile
Main.Glue.delete_user_defined_function
Main.Glue.delete_workflow
Main.Glue.get_blueprint
Main.Glue.get_blueprint_run
Main.Glue.get_blueprint_runs
Main.Glue.get_catalog_import_status
Main.Glue.get_classifier
Main.Glue.get_classifiers
Main.Glue.get_column_statistics_for_partition
Main.Glue.get_column_statistics_for_table
Main.Glue.get_column_statistics_task_run
Main.Glue.get_column_statistics_task_runs
Main.Glue.get_connection
Main.Glue.get_connections
Main.Glue.get_crawler
Main.Glue.get_crawler_metrics
Main.Glue.get_crawlers
Main.Glue.get_custom_entity_type
Main.Glue.get_data_catalog_encryption_settings
Main.Glue.get_data_quality_result
Main.Glue.get_data_quality_rule_recommendation_run
Main.Glue.get_data_quality_ruleset
Main.Glue.get_data_quality_ruleset_evaluation_run
Main.Glue.get_database
Main.Glue.get_databases
Main.Glue.get_dataflow_graph
Main.Glue.get_dev_endpoint
Main.Glue.get_dev_endpoints
Main.Glue.get_job
Main.Glue.get_job_bookmark
Main.Glue.get_job_run
Main.Glue.get_job_runs
Main.Glue.get_jobs
Main.Glue.get_mapping
Main.Glue.get_mltask_run
Main.Glue.get_mltask_runs
Main.Glue.get_mltransform
Main.Glue.get_mltransforms
Main.Glue.get_partition
Main.Glue.get_partition_indexes
Main.Glue.get_partitions
Main.Glue.get_plan
Main.Glue.get_registry
Main.Glue.get_resource_policies
Main.Glue.get_resource_policy
Main.Glue.get_schema
Main.Glue.get_schema_by_definition
Main.Glue.get_schema_version
Main.Glue.get_schema_versions_diff
Main.Glue.get_security_configuration
Main.Glue.get_security_configurations
Main.Glue.get_session
Main.Glue.get_statement
Main.Glue.get_table
Main.Glue.get_table_optimizer
Main.Glue.get_table_version
Main.Glue.get_table_versions
Main.Glue.get_tables
Main.Glue.get_tags
Main.Glue.get_trigger
Main.Glue.get_triggers
Main.Glue.get_unfiltered_partition_metadata
Main.Glue.get_unfiltered_partitions_metadata
Main.Glue.get_unfiltered_table_metadata
Main.Glue.get_usage_profile
Main.Glue.get_user_defined_function
Main.Glue.get_user_defined_functions
Main.Glue.get_workflow
Main.Glue.get_workflow_run
Main.Glue.get_workflow_run_properties
Main.Glue.get_workflow_runs
Main.Glue.import_catalog_to_glue
Main.Glue.list_blueprints
Main.Glue.list_column_statistics_task_runs
Main.Glue.list_crawlers
Main.Glue.list_crawls
Main.Glue.list_custom_entity_types
Main.Glue.list_data_quality_results
Main.Glue.list_data_quality_rule_recommendation_runs
Main.Glue.list_data_quality_ruleset_evaluation_runs
Main.Glue.list_data_quality_rulesets
Main.Glue.list_dev_endpoints
Main.Glue.list_jobs
Main.Glue.list_mltransforms
Main.Glue.list_registries
Main.Glue.list_schema_versions
Main.Glue.list_schemas
Main.Glue.list_sessions
Main.Glue.list_statements
Main.Glue.list_table_optimizer_runs
Main.Glue.list_triggers
Main.Glue.list_usage_profiles
Main.Glue.list_workflows
Main.Glue.put_data_catalog_encryption_settings
Main.Glue.put_resource_policy
Main.Glue.put_schema_version_metadata
Main.Glue.put_workflow_run_properties
Main.Glue.query_schema_version_metadata
Main.Glue.register_schema_version
Main.Glue.remove_schema_version_metadata
Main.Glue.reset_job_bookmark
Main.Glue.resume_workflow_run
Main.Glue.run_statement
Main.Glue.search_tables
Main.Glue.start_blueprint_run
Main.Glue.start_column_statistics_task_run
Main.Glue.start_crawler
Main.Glue.start_crawler_schedule
Main.Glue.start_data_quality_rule_recommendation_run
Main.Glue.start_data_quality_ruleset_evaluation_run
Main.Glue.start_export_labels_task_run
Main.Glue.start_import_labels_task_run
Main.Glue.start_job_run
Main.Glue.start_mlevaluation_task_run
Main.Glue.start_mllabeling_set_generation_task_run
Main.Glue.start_trigger
Main.Glue.start_workflow_run
Main.Glue.stop_column_statistics_task_run
Main.Glue.stop_crawler
Main.Glue.stop_crawler_schedule
Main.Glue.stop_session
Main.Glue.stop_trigger
Main.Glue.stop_workflow_run
Main.Glue.tag_resource
Main.Glue.untag_resource
Main.Glue.update_blueprint
Main.Glue.update_classifier
Main.Glue.update_column_statistics_for_partition
Main.Glue.update_column_statistics_for_table
Main.Glue.update_connection
Main.Glue.update_crawler
Main.Glue.update_crawler_schedule
Main.Glue.update_data_quality_ruleset
Main.Glue.update_database
Main.Glue.update_dev_endpoint
Main.Glue.update_job
Main.Glue.update_job_from_source_control
Main.Glue.update_mltransform
Main.Glue.update_partition
Main.Glue.update_registry
Main.Glue.update_schema
Main.Glue.update_source_control_from_job
Main.Glue.update_table
Main.Glue.update_table_optimizer
Main.Glue.update_trigger
Main.Glue.update_usage_profile
Main.Glue.update_user_defined_function
Main.Glue.update_workflow
Documentation
Main.Glue.batch_create_partition
— Method
batch_create_partition(database_name, partition_input_list, table_name)
batch_create_partition(database_name, partition_input_list, table_name, params::Dict{String,<:Any})
Creates one or more partitions in a batch operation.
Arguments
database_name
: The name of the metadata database in which the partition is to be created.
partition_input_list
: A list of PartitionInput structures that define the partitions to be created.
table_name
: The name of the metadata table in which the partition is to be created.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the catalog in which the partition is to be created. Currently, this should be the Amazon Web Services account ID.
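As a rough illustration of the two call forms above, the following sketch builds the positional arguments and the optional params dictionary. The database, table, and account ID are hypothetical, and the "Values" field is assumed from the Glue API's PartitionInput structure rather than taken from this page. The service call itself is left as a comment because it needs AWS.jl and configured credentials.

```julia
# Hypothetical names; replace with your own database, table, and account ID.
database_name = "sales_db"
table_name = "orders"

# Each element mirrors a PartitionInput structure. Only the "Values" field is
# filled in here (assumed to hold the partition key values, in key order).
partition_input_list = [
    Dict("Values" => ["2024", "01"]),
    Dict("Values" => ["2024", "02"]),
]

# Optional parameters go in a Dict keyed by the names listed above.
params = Dict{String,Any}("CatalogId" => "123456789012")

# With AWS.jl loaded and credentials configured, the call would look like:
#   using AWS
#   @service Glue
#   Glue.batch_create_partition(database_name, partition_input_list, table_name, params)
```

The same pattern applies to the other functions on this page: required values are positional, and anything listed under Optional Parameters goes into the trailing dictionary.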
Main.Glue.batch_delete_connection
— Method
batch_delete_connection(connection_name_list)
batch_delete_connection(connection_name_list, params::Dict{String,<:Any})
Deletes a list of connection definitions from the Data Catalog.
Arguments
connection_name_list
: A list of names of the connections to delete.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the connections reside. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.batch_delete_partition
— Method
batch_delete_partition(database_name, partitions_to_delete, table_name)
batch_delete_partition(database_name, partitions_to_delete, table_name, params::Dict{String,<:Any})
Deletes one or more partitions in a batch operation.
Arguments
database_name
: The name of the catalog database in which the table in question resides.
partitions_to_delete
: A list of PartitionInput structures that define the partitions to be deleted.
table_name
: The name of the table that contains the partitions to be deleted.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.batch_delete_table
— Method
batch_delete_table(database_name, tables_to_delete)
batch_delete_table(database_name, tables_to_delete, params::Dict{String,<:Any})
Deletes multiple tables at once. After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service. To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
Arguments
database_name
: The name of the catalog database in which the tables to delete reside. For Hive compatibility, this name is entirely lowercase.
tables_to_delete
: A list of the tables to delete.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.
"TransactionId"
: The transaction ID at which to delete the table contents.
Main.Glue.batch_delete_table_version
— Method
batch_delete_table_version(database_name, table_name, version_ids)
batch_delete_table_version(database_name, table_name, version_ids, params::Dict{String,<:Any})
Deletes a specified batch of versions of a table.
Arguments
database_name
: The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
table_name
: The name of the table. For Hive compatibility, this name is entirely lowercase.
version_ids
: A list of the IDs of versions to be deleted. A VersionId is a string representation of an integer. Each version is incremented by 1.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.
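Because each VersionId is a string holding an integer, a list of consecutive versions can be generated from an integer range. The sketch below uses hypothetical database and table names, and lowercasing them on the client side is an assumption made here to match the Hive-compatibility note, not a documented requirement of this function.

```julia
# Hypothetical names, lowercased to match the Hive-compatibility note above.
database_name = lowercase("Sales_DB")
table_name = lowercase("Orders")

# Version IDs are strings holding integers, so build them from an integer range.
version_ids = string.(1:3)

# With AWS.jl loaded and credentials configured, the call would look like:
#   Glue.batch_delete_table_version(database_name, table_name, version_ids)
```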
Main.Glue.batch_get_blueprints
— Method
batch_get_blueprints(names)
batch_get_blueprints(names, params::Dict{String,<:Any})
Retrieves information about a list of blueprints.
Arguments
names
: A list of blueprint names.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"IncludeBlueprint"
: Specifies whether or not to include the blueprint in the response.
"IncludeParameterSpec"
: Specifies whether or not to include the parameters, as a JSON string, for the blueprint in the response.
Main.Glue.batch_get_crawlers
— Method
batch_get_crawlers(crawler_names)
batch_get_crawlers(crawler_names, params::Dict{String,<:Any})
Returns a list of resource metadata for a given list of crawler names. After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Arguments
crawler_names
: A list of crawler names, which might be the names returned from the ListCrawlers operation.
Main.Glue.batch_get_custom_entity_types
— Method
batch_get_custom_entity_types(names)
batch_get_custom_entity_types(names, params::Dict{String,<:Any})
Retrieves the details for the custom patterns specified by a list of names.
Arguments
names
: A list of names of the custom patterns that you want to retrieve.
Main.Glue.batch_get_data_quality_result
— Method
batch_get_data_quality_result(result_ids)
batch_get_data_quality_result(result_ids, params::Dict{String,<:Any})
Retrieves a list of data quality results for the specified result IDs.
Arguments
result_ids
: A list of unique result IDs for the data quality results.
Main.Glue.batch_get_dev_endpoints
— Method
batch_get_dev_endpoints(dev_endpoint_names)
batch_get_dev_endpoints(dev_endpoint_names, params::Dict{String,<:Any})
Returns a list of resource metadata for a given list of development endpoint names. After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Arguments
dev_endpoint_names
: The list of DevEndpoint names, which might be the names returned from the ListDevEndpoints operation.
Main.Glue.batch_get_jobs
— Method
batch_get_jobs(job_names)
batch_get_jobs(job_names, params::Dict{String,<:Any})
Returns a list of resource metadata for a given list of job names. After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Arguments
job_names
: A list of job names, which might be the names returned from the ListJobs operation.
Main.Glue.batch_get_partition
— Method
batch_get_partition(database_name, partitions_to_get, table_name)
batch_get_partition(database_name, partitions_to_get, table_name, params::Dict{String,<:Any})
Retrieves partitions in a batch request.
Arguments
database_name
: The name of the catalog database where the partitions reside.
partitions_to_get
: A list of partition values identifying the partitions to retrieve.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.batch_get_table_optimizer
— Method
batch_get_table_optimizer(entries)
batch_get_table_optimizer(entries, params::Dict{String,<:Any})
Returns the configuration for the specified table optimizers.
Arguments
entries
: A list of BatchGetTableOptimizerEntry objects specifying the table optimizers to retrieve.
Main.Glue.batch_get_triggers
— Method
batch_get_triggers(trigger_names)
batch_get_triggers(trigger_names, params::Dict{String,<:Any})
Returns a list of resource metadata for a given list of trigger names. After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Arguments
trigger_names
: A list of trigger names, which may be the names returned from the ListTriggers operation.
Main.Glue.batch_get_workflows
— Method
batch_get_workflows(names)
batch_get_workflows(names, params::Dict{String,<:Any})
Returns a list of resource metadata for a given list of workflow names. After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Arguments
names
: A list of workflow names, which may be the names returned from the ListWorkflows operation.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"IncludeGraph"
: Specifies whether to include a graph when returning the workflow resource metadata.
Main.Glue.batch_stop_job_run
— Method
batch_stop_job_run(job_name, job_run_ids)
batch_stop_job_run(job_name, job_run_ids, params::Dict{String,<:Any})
Stops one or more job runs for a specified job definition.
Arguments
job_name
: The name of the job definition for which to stop job runs.
job_run_ids
: A list of the JobRunIds that should be stopped for that job definition.
Main.Glue.batch_update_partition
— Method
batch_update_partition(database_name, entries, table_name)
batch_update_partition(database_name, entries, table_name, params::Dict{String,<:Any})
Updates one or more partitions in a batch operation.
Arguments
database_name
: The name of the metadata database in which the partition is to be updated.
entries
: A list of up to 100 BatchUpdatePartitionRequestEntry objects to update.
table_name
: The name of the metadata table in which the partition is to be updated.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the catalog in which the partition is to be updated. Currently, this should be the Amazon Web Services account ID.
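Since entries accepts at most 100 objects per call, a larger update has to be split across several calls. The following is one possible sketch of that batching using Iterators.partition from Julia's Base. The entry contents are hypothetical, and the "PartitionValueList" and "PartitionInput" field names are assumed from the Glue API's BatchUpdatePartitionRequestEntry structure rather than stated on this page.

```julia
# Build 250 hypothetical entries; each mirrors a BatchUpdatePartitionRequestEntry.
entries = [
    Dict(
        "PartitionValueList" => ["2024", string(i)],                # partition to update
        "PartitionInput" => Dict("Values" => ["2024", string(i)]),  # its new definition
    ) for i in 1:250
]

# Split into batches of at most 100 entries, the limit stated above.
batches = collect(Iterators.partition(entries, 100))

# With AWS.jl loaded and credentials configured, each batch would be sent separately:
#   for batch in batches
#       Glue.batch_update_partition("sales_db", collect(batch), "orders")
#   end
```

Iterators.partition yields views into the original vector, so collect(batch) is used in the commented call to hand a plain vector to the service function.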
Main.Glue.cancel_data_quality_rule_recommendation_run
— Method
cancel_data_quality_rule_recommendation_run(run_id)
cancel_data_quality_rule_recommendation_run(run_id, params::Dict{String,<:Any})
Cancels the specified recommendation run that was being used to generate rules.
Arguments
run_id
: The unique run identifier associated with this run.
Main.Glue.cancel_data_quality_ruleset_evaluation_run
— Method
cancel_data_quality_ruleset_evaluation_run(run_id)
cancel_data_quality_ruleset_evaluation_run(run_id, params::Dict{String,<:Any})
Cancels a run where a ruleset is being evaluated against a data source.
Arguments
run_id
: The unique run identifier associated with this run.
Main.Glue.cancel_mltask_run
— Method
cancel_mltask_run(task_run_id, transform_id)
cancel_mltask_run(task_run_id, transform_id, params::Dict{String,<:Any})
Cancels (stops) a task run. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with a task run's parent transform's TransformID and the task run's TaskRunId.
Arguments
task_run_id
: A unique identifier for the task run.
transform_id
: The unique identifier of the machine learning transform.
Main.Glue.cancel_statement
— Method
cancel_statement(id, session_id)
cancel_statement(id, session_id, params::Dict{String,<:Any})
Cancels the statement.
Arguments
id
: The ID of the statement to be cancelled.
session_id
: The Session ID of the statement to be cancelled.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RequestOrigin"
: The origin of the request to cancel the statement.
Main.Glue.check_schema_version_validity
— Method
check_schema_version_validity(data_format, schema_definition)
check_schema_version_validity(data_format, schema_definition, params::Dict{String,<:Any})
Validates the supplied schema. This call has no side effects; it simply validates the supplied schema using DataFormat as the format. Since it does not take a schema set name, no compatibility checks are performed.
Arguments
data_format
: The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.
schema_definition
: The definition of the schema that has to be validated.
Main.Glue.create_blueprint
— Method
create_blueprint(blueprint_location, name)
create_blueprint(blueprint_location, name, params::Dict{String,<:Any})
Registers a blueprint with Glue.
Arguments
blueprint_location
: Specifies a path in Amazon S3 where the blueprint is published.
name
: The name of the blueprint.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the blueprint.
"Tags"
: The tags to be applied to this blueprint.
Main.Glue.create_classifier
— Method
create_classifier()
create_classifier(params::Dict{String,<:Any})
Creates a classifier in the user's account. This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CsvClassifier"
: A CsvClassifier object specifying the classifier to create.
"GrokClassifier"
: A GrokClassifier object specifying the classifier to create.
"JsonClassifier"
: A JsonClassifier object specifying the classifier to create.
"XMLClassifier"
: An XMLClassifier object specifying the classifier to create.
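Because create_classifier takes no positional arguments, the kind of classifier is selected entirely by which key appears in params. The sketch below sets a single "CsvClassifier" key and checks that no other classifier key is present. Treating "exactly one key" as the rule is an assumption made here for safety, the classifier name is hypothetical, and the "Name" and "Delimiter" fields are assumed from the Glue API's CSV classifier request structure.

```julia
# One classifier key is set; the key chosen selects the classifier type.
params = Dict{String,Any}(
    "CsvClassifier" => Dict(
        "Name" => "orders-csv",   # hypothetical classifier name
        "Delimiter" => ",",
    ),
)

# Guard against accidentally supplying more than one classifier definition.
classifier_keys = ["CsvClassifier", "GrokClassifier", "JsonClassifier", "XMLClassifier"]
present = filter(k -> haskey(params, k), classifier_keys)
length(present) == 1 || error("expected exactly one classifier key, got $(present)")

# With AWS.jl loaded and credentials configured, the call would look like:
#   Glue.create_classifier(params)
```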
Main.Glue.create_connection
— Method
create_connection(connection_input)
create_connection(connection_input, params::Dict{String,<:Any})
Creates a connection definition in the Data Catalog. Connections used for creating federated resources require the IAM glue:PassConnection permission.
Arguments
connection_input
: A ConnectionInput object defining the connection to create.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which to create the connection. If none is provided, the Amazon Web Services account ID is used by default.
"Tags"
: The tags you assign to the connection.
Main.Glue.create_crawler
— Method
create_crawler(name, role, targets)
create_crawler(name, role, targets, params::Dict{String,<:Any})
Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.
Arguments
name
: Name of the new crawler.
role
: The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.
targets
: A collection of targets to crawl.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Classifiers"
: A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
"Configuration"
: Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.
"CrawlerSecurityConfiguration"
: The name of the SecurityConfiguration structure to be used by this crawler.
"DatabaseName"
: The Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.
"Description"
: A description of the new crawler.
"LakeFormationConfiguration"
: Specifies Lake Formation configuration settings for the crawler.
"LineageConfiguration"
: Specifies data lineage configuration settings for the crawler.
"RecrawlPolicy"
: A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
"Schedule"
: A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
"SchemaChangePolicy"
: The policy for the crawler's update and deletion behavior.
"TablePrefix"
: The table prefix used for catalog tables that are created.
"Tags"
: The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
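A minimal sketch of the inputs for a scheduled crawler with one S3 target might look like the following. The crawler name, role ARN, bucket path, and database name are all hypothetical. The "S3Targets" key and its "Path" field are assumed from the Glue API's CrawlerTargets structure, whose casing differs from the s3Targets spelling used in the description above. The schedule reuses the daily 12:15 UTC cron expression quoted in the "Schedule" entry.

```julia
# Hypothetical crawler name and IAM role ARN.
name = "orders-crawler"
role = "arn:aws:iam::123456789012:role/GlueCrawlerRole"

# At least one crawl target is required; here a single S3 path (assumed field names).
targets = Dict("S3Targets" => [Dict("Path" => "s3://example-bucket/orders/")])

# Optional settings, keyed by the names listed above.
params = Dict{String,Any}(
    "DatabaseName" => "sales_db",          # hypothetical catalog database
    "Schedule" => "cron(15 12 * * ? *)",   # every day at 12:15 UTC
    "TablePrefix" => "raw_",
)

# With AWS.jl loaded and credentials configured, the call would look like:
#   Glue.create_crawler(name, role, targets, params)
```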
Main.Glue.create_custom_entity_type
— Method
create_custom_entity_type(name, regex_string)
create_custom_entity_type(name, regex_string, params::Dict{String,<:Any})
Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data. Each custom pattern you create specifies a regular expression and an optional list of context words. If no context words are passed, only the regular expression is checked.
Arguments
name
: A name for the custom pattern that allows it to be retrieved or deleted later. This name must be unique per Amazon Web Services account.
regex_string
: A regular expression string that is used for detecting sensitive data in a custom pattern.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"ContextWords"
: A list of context words. If none of these context words are found within the vicinity of the regular expression, the data will not be detected as sensitive data. If no context words are passed, only the regular expression is checked.
"Tags"
: A list of tags applied to the custom entity type.
Main.Glue.create_data_quality_ruleset
— Method
create_data_quality_ruleset(name, ruleset)
create_data_quality_ruleset(name, ruleset, params::Dict{String,<:Any})
Creates a data quality ruleset with DQDL rules applied to a specified Glue table. You create the ruleset using the Data Quality Definition Language (DQDL). For more information, see the Glue developer guide.
Arguments
name
: A unique name for the data quality ruleset.
ruleset
: A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"ClientToken"
: Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
"Description"
: A description of the data quality ruleset.
"Tags"
: A list of tags applied to the data quality ruleset.
"TargetTable"
: A target table associated with the data quality ruleset.
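The "ClientToken" recommendation can be followed with the UUIDs standard library, as in this sketch. The ruleset name, the one-rule DQDL string, and the table and database names are hypothetical, and the "TableName" and "DatabaseName" fields of "TargetTable" are assumed from the Glue API rather than stated on this page.

```julia
using UUIDs

# Hypothetical ruleset name and a one-rule DQDL ruleset.
name = "orders-basic-checks"
ruleset = """Rules = [ IsComplete "order_id" ]"""

params = Dict{String,Any}(
    # A random UUID, so that a retried request does not create a second ruleset.
    "ClientToken" => string(uuid4()),
    "Description" => "Completeness check for the orders table",
    # Assumed field names for the target table structure.
    "TargetTable" => Dict("TableName" => "orders", "DatabaseName" => "sales_db"),
)

# With AWS.jl loaded and credentials configured, the call would look like:
#   Glue.create_data_quality_ruleset(name, ruleset, params)
```

For the token to help with idempotency, generate it once per logical request and reuse the same value on retries instead of calling uuid4() again.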
Main.Glue.create_database
— Method
create_database(database_input)
create_database(database_input, params::Dict{String,<:Any})
Creates a new database in a Data Catalog.
Arguments
database_input
: The metadata for the database.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which to create the database. If none is provided, the Amazon Web Services account ID is used by default.
"Tags"
: The tags you assign to the database.
Main.Glue.create_dev_endpoint
— Method
create_dev_endpoint(endpoint_name, role_arn)
create_dev_endpoint(endpoint_name, role_arn, params::Dict{String,<:Any})
Creates a new development endpoint.
Arguments
endpoint_name
: The name to be assigned to the new DevEndpoint.
role_arn
: The IAM role for the DevEndpoint.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Arguments"
: A map of arguments used to configure the DevEndpoint.
"ExtraJarsS3Path"
: The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.
"ExtraPythonLibsS3Path"
: The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma. You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.
"GlueVersion"
: Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints. For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide. Development endpoints that are created without specifying a Glue version default to Glue 0.9. You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.
"NumberOfNodes"
: The number of Glue Data Processing Units (DPUs) to allocate to this DevEndpoint.
"NumberOfWorkers"
: The number of workers of a defined workerType that are allocated to the development endpoint. The maximum number of workers you can define is 299 for G.1X, and 149 for G.2X.
"PublicKey"
: The public key to be used by this DevEndpoint for authentication. This attribute is provided for backward compatibility because the recommended attribute to use is public keys.
"PublicKeys"
: A list of public keys to be used by the development endpoints for authentication. The use of this attribute is preferred over a single public key because the public keys allow you to have a different private key per client. If you previously created an endpoint with a public key, you must remove that key to be able to set a list of public keys. Call the UpdateDevEndpoint API with the public key content in the deletePublicKeys attribute, and the list of new keys in the addPublicKeys attribute.
"SecurityConfiguration"
: The name of the SecurityConfiguration structure to be used with this DevEndpoint.
"SecurityGroupIds"
: Security group IDs for the security groups to be used by the new DevEndpoint.
"SubnetId"
: The subnet ID for the new DevEndpoint to use.
"Tags"
: The tags to use with this DevEndpoint. You may use tags to limit access to the DevEndpoint. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
"WorkerType"
: The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X. For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker. For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs. For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs. Known issue: when a development endpoint is created with the G.2X WorkerType configuration, the Spark drivers for the development endpoint will run on 4 vCPU, 16 GB of memory, and a 64 GB disk.
Main.Glue.create_job
— Methodcreate_job(command, name, role)
create_job(command, name, role, params::Dict{String,<:Any})
Creates a new job definition.
Arguments
command
: The JobCommand that runs this job.name
: The name you assign to this job definition. It must be unique in your account.role
: The name or Amazon Resource Name (ARN) of the IAM role associated with this job.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AllocatedCapacity"
: This parameter is deprecated. Use MaxCapacity instead. The number of Glue data processing units (DPUs) to allocate to this Job. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.
"CodeGenConfigurationNodes"
: The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation are based.
"Connections"
: The connections used for this job.
"DefaultArguments"
: The default arguments for every run of this job, specified as name-value pairs. You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes. Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job. For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide. For information about the arguments you can provide to this field when configuring Spark jobs, see the Special Parameters Used by Glue topic in the developer guide. For information about the arguments you can provide to this field when configuring Ray jobs, see Using job parameters in Ray jobs in the developer guide.
"Description"
: Description of the job being defined.
"ExecutionClass"
: Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources. The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary. Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.
"ExecutionProperty"
: An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.
"GlueVersion"
: In Spark jobs, GlueVersion determines the versions of Apache Spark and Python that Glue makes available in a job. The Python version indicates the version supported for jobs of type Spark. Ray jobs should set GlueVersion to 4.0 or greater. However, the versions of Ray, Python and additional libraries available in your Ray job are determined by the Runtime parameter of the Job command. For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide. Jobs that are created without specifying a Glue version default to Glue 0.9.
"JobMode"
: A mode that describes how a job was created. Valid values are: SCRIPT - The job was created using the Glue Studio script editor. VISUAL - The job was created using the Glue Studio visual editor. NOTEBOOK - The job was created using an interactive sessions notebook. When the JobMode field is missing or null, SCRIPT is assigned as the default value.
"LogUri"
: This field is reserved for future use.
"MaintenanceWindow"
: This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs. Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00 AM GMT, your jobs will be restarted between 10:00 AM GMT and 1:00 PM GMT.
"MaxCapacity"
: For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page. For Glue version 2.0+ jobs, you cannot specify a Maximum capacity. Instead, you should specify a Worker type and the Number of workers. Do not set MaxCapacity if using WorkerType and NumberOfWorkers. The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL job: When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU. When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
"MaxRetries"
: The maximum number of times to retry this job if it fails.
"NonOverridableArguments"
: Arguments for this job that are not overridden when providing job arguments in a job run, specified as name-value pairs.
"NotificationProperty"
: Specifies configuration properties of a job notification.
"NumberOfWorkers"
: The number of workers of a defined WorkerType that are allocated when a job runs.
"SecurityConfiguration"
: The name of the SecurityConfiguration structure to be used with this job.
"SourceControlDetails"
: The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.
"Tags"
: The tags to use with this job. You may use tags to limit access to the job. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
"Timeout"
: The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours) for batch jobs. Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job is restarted after 7 days if you have not set up a maintenance window. If you have set up a maintenance window, the job is restarted during the maintenance window after 7 days.
"WorkerType"
: The type of predefined worker that is allocated when a job runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
For the G.1X worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84 GB disk (approximately 34 GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.2X worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128 GB disk (approximately 77 GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.4X worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256 GB disk (approximately 235 GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512 GB disk (approximately 487 GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X worker type.
For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84 GB disk (approximately 34 GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120 GB free), and provides up to 8 Ray workers based on the autoscaler.
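The rule that MaxCapacity should not be combined with WorkerType and NumberOfWorkers can be checked before the request is sent. The following is a minimal sketch, not part of the Glue module: the helper name job_capacity_params is hypothetical, and the job name, role, and script location in the commented call are placeholders.

```julia
# Hypothetical helper (not part of the Glue module): builds the capacity-related
# optional parameters for create_job and rejects MaxCapacity when it is combined
# with WorkerType or NumberOfWorkers.
function job_capacity_params(; worker_type=nothing, number_of_workers=nothing,
                             max_capacity=nothing)
    uses_workers = worker_type !== nothing || number_of_workers !== nothing
    if max_capacity !== nothing && uses_workers
        throw(ArgumentError("MaxCapacity cannot be set together with WorkerType or NumberOfWorkers"))
    end
    params = Dict{String,Any}()
    worker_type !== nothing && (params["WorkerType"] = worker_type)
    number_of_workers !== nothing && (params["NumberOfWorkers"] = number_of_workers)
    max_capacity !== nothing && (params["MaxCapacity"] = max_capacity)
    return params
end

params = job_capacity_params(worker_type="G.1X", number_of_workers=10)

# With the Glue module loaded, the call might then look like this
# (all names below are placeholders):
# command = Dict("Name" => "glueetl", "ScriptLocation" => "s3://my-bucket/script.py")
# Glue.create_job(command, "my-job", "MyGlueRole", params)
```

Failing locally with an ArgumentError is a design choice of this sketch; the service performs its own validation regardless.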
Main.Glue.create_mltransform — Method
create_mltransform(input_record_tables, name, parameters, role)
create_mltransform(input_record_tables, name, parameters, role, params::Dict{String,<:Any})
Creates a Glue machine learning transform. This operation creates the transform and all the necessary parameters to train it. Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm. You must also specify certain parameters for the tasks that Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs.
Arguments
input_record_tables
: A list of Glue table definitions used by the transform.
name
: The unique name that you give the transform when you create it.
parameters
: The algorithmic parameters that are specific to the transform type used. Conditionally dependent on the transform type.
role
: The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform. This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue. This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the machine learning transform that is being defined. The default is an empty string.
"GlueVersion"
: This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.
"MaxCapacity"
: The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page. MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType. If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set. If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set. If WorkerType is set, then NumberOfWorkers is required (and vice versa). MaxCapacity and NumberOfWorkers must both be at least 1. When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
"MaxRetries"
: The maximum number of times to retry a task for this transform after a task run fails.
"NumberOfWorkers"
: The number of workers of a defined WorkerType that are allocated when this task runs. If WorkerType is set, then NumberOfWorkers is required (and vice versa).
"Tags"
: The tags to use with this machine learning transform. You may use tags to limit access to the machine learning transform. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
"Timeout"
: The timeout of the task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
"TransformEncryption"
: The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.
"WorkerType"
: The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X. For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker. For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64 GB disk, and 1 executor per worker. For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128 GB disk, and 1 executor per worker. MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType. If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set. If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set. If WorkerType is set, then NumberOfWorkers is required (and vice versa). MaxCapacity and NumberOfWorkers must both be at least 1.
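The interaction between MaxCapacity, WorkerType, and NumberOfWorkers described above can be expressed as a small pre-flight check on the params dictionary. This is a sketch only; the function name mltransform_capacity_problems is hypothetical and not part of the Glue module.

```julia
# Hypothetical validator (not part of the Glue module) for the capacity rules
# described above. Returns a list of problems; an empty list means the
# combination looks consistent with the documented rules.
function mltransform_capacity_problems(params::Dict{String,<:Any})
    problems = String[]
    has_max = haskey(params, "MaxCapacity")
    has_type = haskey(params, "WorkerType")
    has_count = haskey(params, "NumberOfWorkers")
    if has_max && (has_type || has_count)
        push!(problems, "MaxCapacity is mutually exclusive with WorkerType and NumberOfWorkers")
    end
    if has_type != has_count
        push!(problems, "WorkerType and NumberOfWorkers must be set together")
    end
    if has_max && params["MaxCapacity"] < 1
        push!(problems, "MaxCapacity must be at least 1")
    end
    if has_count && params["NumberOfWorkers"] < 1
        push!(problems, "NumberOfWorkers must be at least 1")
    end
    return problems
end

# A consistent combination: worker type plus worker count, no MaxCapacity.
ok = mltransform_capacity_problems(Dict("WorkerType" => "G.2X", "NumberOfWorkers" => 5))

# An inconsistent combination: MaxCapacity with WorkerType, and no NumberOfWorkers.
bad = mltransform_capacity_problems(Dict("MaxCapacity" => 10.0, "WorkerType" => "G.1X"))
```

Returning a list rather than throwing on the first violation lets a caller report every problem at once.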
Main.Glue.create_partition — Method
create_partition(database_name, partition_input, table_name)
create_partition(database_name, partition_input, table_name, params::Dict{String,<:Any})
Creates a new partition.
Arguments
database_name
: The name of the metadata database in which the partition is to be created.
partition_input
: A PartitionInput structure defining the partition to be created.
table_name
: The name of the metadata table in which the partition is to be created.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The Amazon Web Services account ID of the catalog in which the partition is to be created.
Main.Glue.create_partition_index — Method
create_partition_index(database_name, partition_index, table_name)
create_partition_index(database_name, partition_index, table_name, params::Dict{String,<:Any})
Creates a specified partition index in an existing table.
Arguments
database_name
: Specifies the name of a database in which you want to create a partition index.
partition_index
: Specifies a PartitionIndex structure to create a partition index in an existing table.
table_name
: Specifies the name of a table in which you want to create a partition index.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The catalog ID where the table resides.
Main.Glue.create_registry — Method
create_registry(registry_name)
create_registry(registry_name, params::Dict{String,<:Any})
Creates a new registry which may be used to hold a collection of schemas.
Arguments
registry_name
: The name of the registry to create. It has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. No whitespace is allowed.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the registry. If no description is provided, no default value is assigned.
"Tags"
: Amazon Web Services tags that contain a key-value pair and may be searched by console, command line, or API.
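The naming rule for registry_name can be checked locally before calling the service. The following is a minimal sketch: the function name is hypothetical, and it assumes that "letters" and "numbers" mean ASCII letters and digits, which the description above does not state explicitly.

```julia
# Hypothetical check (not part of the Glue module) mirroring the naming rule above:
# 1 to 255 characters drawn from letters, digits, hyphen, underscore, dollar sign,
# and hash mark. ASCII-only letters and digits are an assumption of this sketch.
is_valid_registry_name(name::AbstractString) =
    occursin(r"^[A-Za-z0-9_$#-]{1,255}\z", name)

is_valid_registry_name("orders-registry_1")   # accepted by this check
is_valid_registry_name("has space")           # rejected: contains whitespace
```

The pattern ends with \z rather than $ so that a trailing newline is not silently accepted.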
Main.Glue.create_schema — Method
create_schema(data_format, schema_name)
create_schema(data_format, schema_name, params::Dict{String,<:Any})
Creates a new schema set and registers the schema definition. Returns an error if the schema set already exists without actually registering the version. When the schema set is created, a version checkpoint will be set to the first version. Compatibility mode "DISABLED" restricts any additional schema versions from being added after the first schema version. For all other compatibility modes, validation of compatibility settings will be applied only from the second version onwards when the RegisterSchemaVersion API is used. When this API is called without a RegistryId, this will create an entry for a "default-registry" in the registry database tables, if it is not already present.
Arguments
data_format
: The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.
schema_name
: The name of the schema to create. It has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. No whitespace is allowed.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Compatibility"
: The compatibility mode of the schema. The possible values are:
NONE: No compatibility mode applies. You can use this choice in development scenarios or if you do not know the compatibility mode that you want to apply to schemas. Any new version added will be accepted without undergoing a compatibility check.
DISABLED: This compatibility choice prevents versioning for a particular schema. You can use this choice to prevent future versioning of a schema.
BACKWARD: This compatibility choice is recommended as it allows data receivers to read both the current and one previous schema version. This means that for instance, a new schema version cannot drop data fields or change the type of these fields, so they can't be read by readers using the previous version.
BACKWARD_ALL: This compatibility choice allows data receivers to read both the current and all previous schema versions. You can use this choice when you need to delete fields or add optional fields, and check compatibility against all previous schema versions.
FORWARD: This compatibility choice allows data receivers to read both the current and one next schema version, but not necessarily later versions. You can use this choice when you need to add fields or delete optional fields, but only check compatibility against the last schema version.
FORWARD_ALL: This compatibility choice allows data receivers to read data written by producers of any new registered schema. You can use this choice when you need to add fields or delete optional fields, and check compatibility against all previous schema versions.
FULL: This compatibility choice allows data receivers to read data written by producers using the previous or next version of the schema, but not necessarily earlier or later versions. You can use this choice when you need to add or remove optional fields, but only check compatibility against the last schema version.
FULL_ALL: This compatibility choice allows data receivers to read data written by producers using all previous schema versions. You can use this choice when you need to add or remove optional fields, and check compatibility against all previous schema versions.
"Description"
: An optional description of the schema. If no description is provided, no default value is assigned.
"RegistryId"
: This is a wrapper shape to contain the registry identity fields. If this is not provided, the default registry will be used. The ARN format for the same will be: arn:aws:glue:us-east-2:<customer id>:registry/default-registry:random-5-letter-id.
"SchemaDefinition"
: The schema definition using the DataFormat setting for SchemaName.
"Tags"
: Amazon Web Services tags that contain a key-value pair and may be searched by console, command line, or API. If specified, follows the Amazon Web Services tags-on-create pattern.
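As an illustration of how the optional parameters fit together, the sketch below assembles a params dictionary for create_schema and rejects unknown compatibility modes. The helper name schema_params is hypothetical, the mode names are spelled with underscores as in the Glue API, RegistryName is assumed to be a field of the RegistryId wrapper, and the Avro definition and schema name are placeholders.

```julia
# Compatibility modes described above, spelled as in the Glue API.
const COMPATIBILITY_MODES = ("NONE", "DISABLED", "BACKWARD", "BACKWARD_ALL",
                             "FORWARD", "FORWARD_ALL", "FULL", "FULL_ALL")

# Hypothetical helper (not part of the Glue module) that assembles the optional
# parameters for create_schema and rejects unknown compatibility modes.
function schema_params(definition::AbstractString; compatibility="BACKWARD",
                       registry_name=nothing)
    compatibility in COMPATIBILITY_MODES ||
        throw(ArgumentError("unknown compatibility mode: $compatibility"))
    params = Dict{String,Any}("SchemaDefinition" => definition,
                              "Compatibility" => compatibility)
    # RegistryId is a wrapper structure; RegistryName is assumed to be one of its fields.
    registry_name !== nothing &&
        (params["RegistryId"] = Dict("RegistryName" => registry_name))
    return params
end

# Placeholder Avro schema definition.
avro = """{"type": "record", "name": "Order", "fields": [{"name": "id", "type": "long"}]}"""
params = schema_params(avro; compatibility="FULL_ALL", registry_name="orders")

# With the Glue module loaded (placeholder schema name):
# Glue.create_schema("AVRO", "order-events", params)
```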
Main.Glue.create_script — Method
create_script()
create_script(params::Dict{String,<:Any})
Transforms a directed acyclic graph (DAG) into code.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"DagEdges"
: A list of the edges in the DAG.
"DagNodes"
: A list of the nodes in the DAG.
"Language"
: The programming language of the resulting code from the DAG.
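A DAG passed to create_script is a list of nodes plus a list of edges between them. The sketch below builds a two-node DAG and checks that every edge refers to a declared node. The field names (Id, NodeType, Args, Source, Target), the node types, and the Language value are assumptions based on the Glue CodeGenNode and CodeGenEdge shapes, and the check function is hypothetical.

```julia
# Illustrative DAG; field names and node types are assumptions, not confirmed
# by the text above.
nodes = [
    Dict("Id" => "source", "NodeType" => "DataSource", "Args" => []),
    Dict("Id" => "sink", "NodeType" => "DataSink", "Args" => []),
]
edges = [Dict("Source" => "source", "Target" => "sink")]

# Hypothetical sanity check: every edge must connect two declared nodes.
function edges_reference_known_nodes(nodes, edges)
    ids = Set(n["Id"] for n in nodes)
    return all(e["Source"] in ids && e["Target"] in ids for e in edges)
end

params = Dict{String,Any}("DagNodes" => nodes, "DagEdges" => edges,
                          "Language" => "PYTHON")

# With the Glue module loaded:
# Glue.create_script(params)
```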
Main.Glue.create_security_configuration — Method
create_security_configuration(encryption_configuration, name)
create_security_configuration(encryption_configuration, name, params::Dict{String,<:Any})
Creates a new security configuration. A security configuration is a set of security properties that can be used by Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints.
Arguments
encryption_configuration
: The encryption configuration for the new security configuration.
name
: The name for the new security configuration.
Main.Glue.create_session — Method
create_session(command, id, role)
create_session(command, id, role, params::Dict{String,<:Any})
Creates a new session.
Arguments
command
: The SessionCommand that runs the job.
id
: The ID of the session request.
role
: The ARN of the IAM role.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Connections"
: The number of connections to use for the session.
"DefaultArguments"
: A map array of key-value pairs. Max is 75 pairs.
"Description"
: The description of the session.
"GlueVersion"
: The Glue version determines the versions of Apache Spark and Python that Glue supports. The GlueVersion must be greater than 2.0.
"IdleTimeout"
: The number of minutes the session can be idle before it times out. The default for Spark ETL jobs is the value of Timeout. Consult the documentation for other job types.
"MaxCapacity"
: The number of Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory.
"NumberOfWorkers"
: The number of workers of a defined WorkerType to use for the session.
"RequestOrigin"
: The origin of the request.
"SecurityConfiguration"
: The name of the SecurityConfiguration structure to be used with the session.
"Tags"
: The map of key-value pairs (tags) belonging to the session.
"Timeout"
: The number of minutes before the session times out. The default for Spark ETL jobs is 48 hours (2880 minutes), the maximum session lifetime for this job type. Consult the documentation for other job types.
"WorkerType"
: The type of predefined worker that is allocated when a job runs. Accepts a value of G.1X, G.2X, G.4X, or G.8X for Spark jobs. Accepts the value Z.2X for Ray notebooks.
For the G.1X worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84 GB disk (approximately 34 GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.2X worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128 GB disk (approximately 77 GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.4X worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256 GB disk (approximately 235 GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512 GB disk (approximately 487 GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X worker type.
For the Z.2X worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120 GB free), and provides up to 8 Ray workers based on the autoscaler.
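The relationship between Timeout and IdleTimeout for Spark ETL sessions can be sketched as follows. The helper name session_timeout_params is hypothetical; the 2880-minute ceiling and the IdleTimeout default come from the descriptions above and apply to Spark ETL only. The SessionCommand fields and identifiers in the commented call are placeholders.

```julia
# Maximum session lifetime for Spark ETL sessions, in minutes (48 hours),
# as stated in the Timeout description above.
const MAX_SPARK_SESSION_MINUTES = 48 * 60

# Hypothetical helper (not part of the Glue module): IdleTimeout defaults to the
# value of Timeout, matching the documented default for Spark ETL.
function session_timeout_params(; timeout=MAX_SPARK_SESSION_MINUTES, idle_timeout=timeout)
    0 < timeout <= MAX_SPARK_SESSION_MINUTES ||
        throw(ArgumentError("Timeout must be between 1 and $MAX_SPARK_SESSION_MINUTES minutes"))
    return Dict{String,Any}("Timeout" => timeout, "IdleTimeout" => idle_timeout)
end

# A two-hour session whose idle timeout follows the overall timeout.
params = session_timeout_params(timeout=120)

# With the Glue module loaded (all values below are placeholders):
# command = Dict("Name" => "glueetl", "PythonVersion" => "3")
# Glue.create_session(command, "my-session-id", "arn:aws:iam::123456789012:role/MyGlueRole", params)
```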
Main.Glue.create_table — Method
create_table(database_name, table_input)
create_table(database_name, table_input, params::Dict{String,<:Any})
Creates a new table definition in the Data Catalog.
Arguments
database_name
: The catalog database in which to create the new table. For Hive compatibility, this name is entirely lowercase.
table_input
: The TableInput object that defines the metadata table to create in the catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which to create the Table. If none is supplied, the Amazon Web Services account ID is used by default.
"OpenTableFormatInput"
: Specifies an OpenTableFormatInput structure when creating an open format table.
"PartitionIndexes"
: A list of partition indexes, PartitionIndex structures, to create in the table.
"TransactionId"
: The ID of the transaction.
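For illustration, a table_input can be assembled as a nested dictionary. The field names used here (Name, StorageDescriptor, Columns, Location, PartitionKeys) are assumptions based on the Glue TableInput shape, and the table, column, and bucket names are placeholders.

```julia
# Illustrative TableInput; field names are assumptions based on the Glue
# TableInput shape, and all values are placeholders.
table_input = Dict{String,Any}(
    "Name" => "orders",
    "StorageDescriptor" => Dict{String,Any}(
        "Columns" => [Dict("Name" => "id", "Type" => "bigint")],
        "Location" => "s3://my-bucket/orders/",
    ),
    "PartitionKeys" => [Dict("Name" => "dt", "Type" => "string")],
)

# Database names are entirely lowercase for Hive compatibility,
# so normalize the name before calling.
database_name = lowercase("Sales_DB")

# With the Glue module loaded:
# Glue.create_table(database_name, table_input)
```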
Main.Glue.create_table_optimizer — Method
create_table_optimizer(catalog_id, database_name, table_name, table_optimizer_configuration, type)
create_table_optimizer(catalog_id, database_name, table_name, table_optimizer_configuration, type, params::Dict{String,<:Any})
Creates a new table optimizer for a specific function. compaction is the only currently supported optimizer type.
Arguments
catalog_id
: The Catalog ID of the table.
database_name
: The name of the database in the catalog in which the table resides.
table_name
: The name of the table.
table_optimizer_configuration
: A TableOptimizerConfiguration object representing the configuration of a table optimizer.
type
: The type of table optimizer. Currently, the only valid value is compaction.
Main.Glue.create_trigger — Method
create_trigger(actions, name, type)
create_trigger(actions, name, type, params::Dict{String,<:Any})
Creates a new trigger.
Arguments
actions
: The actions initiated by this trigger when it fires.
name
: The name of the trigger.
type
: The type of the new trigger.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the new trigger.
"EventBatchingCondition"
: Batch condition that must be met (specified number of events received or batch time window expired) before the EventBridge event trigger fires.
"Predicate"
: A predicate to specify when the new trigger should fire. This field is required when the trigger type is CONDITIONAL.
"Schedule"
: A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *). This field is required when the trigger type is SCHEDULED.
"StartOnCreation"
: Set to true to start SCHEDULED and CONDITIONAL triggers when created. True is not supported for ON_DEMAND triggers.
"Tags"
: The tags to use with this trigger. You may use tags to limit access to the trigger. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.
"WorkflowName"
: The name of the workflow associated with the trigger.
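The type-dependent requirements above (Schedule for SCHEDULED, Predicate for CONDITIONAL, no StartOnCreation for ON_DEMAND) can be checked locally before calling create_trigger. This is a minimal sketch; the function name trigger_problems is hypothetical and not part of the Glue module.

```julia
# Hypothetical validator (not part of the Glue module) for the trigger rules above.
# Returns a list of problems; an empty list means no rule was violated.
function trigger_problems(trigger_type::AbstractString, params::Dict{String,<:Any})
    problems = String[]
    if trigger_type == "SCHEDULED" && !haskey(params, "Schedule")
        push!(problems, "Schedule is required for SCHEDULED triggers")
    end
    if trigger_type == "CONDITIONAL" && !haskey(params, "Predicate")
        push!(problems, "Predicate is required for CONDITIONAL triggers")
    end
    if trigger_type == "ON_DEMAND" && get(params, "StartOnCreation", false) == true
        push!(problems, "StartOnCreation=true is not supported for ON_DEMAND triggers")
    end
    return problems
end

# Run every day at 12:15 UTC, using the cron expression from the Schedule description.
params = Dict{String,Any}("Schedule" => "cron(15 12 * * ? *)", "StartOnCreation" => true)
problems = trigger_problems("SCHEDULED", params)
```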
Main.Glue.create_usage_profile — Method
create_usage_profile(configuration, name)
create_usage_profile(configuration, name, params::Dict{String,<:Any})
Creates a Glue usage profile.
Arguments
configuration
: A ProfileConfiguration object specifying the job and session values for the profile.
name
: The name of the usage profile.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the usage profile.
"Tags"
: A list of tags applied to the usage profile.
Main.Glue.create_user_defined_function — Method
create_user_defined_function(database_name, function_input)
create_user_defined_function(database_name, function_input, params::Dict{String,<:Any})
Creates a new function definition in the Data Catalog.
Arguments
database_name
: The name of the catalog database in which to create the function.
function_input
: A FunctionInput object that defines the function to create in the Data Catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which to create the function. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.create_workflow — Method
create_workflow(name)
create_workflow(name, params::Dict{String,<:Any})
Creates a new workflow.
Arguments
name
: The name to be assigned to the workflow. It should be unique within your account.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"DefaultRunProperties"
: A collection of properties to be used as part of each execution of the workflow.
"Description"
: A description of the workflow.
"MaxConcurrentRuns"
: You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.
"Tags"
: The tags to be used with this workflow.
Main.Glue.delete_blueprint — Method
delete_blueprint(name)
delete_blueprint(name, params::Dict{String,<:Any})
Deletes an existing blueprint.
Arguments
name
: The name of the blueprint to delete.
Main.Glue.delete_classifier — Method
delete_classifier(name)
delete_classifier(name, params::Dict{String,<:Any})
Removes a classifier from the Data Catalog.
Arguments
name
: Name of the classifier to remove.
Main.Glue.delete_column_statistics_for_partition — Method
delete_column_statistics_for_partition(column_name, database_name, partition_values, table_name)
delete_column_statistics_for_partition(column_name, database_name, partition_values, table_name, params::Dict{String,<:Any})
Deletes the partition column statistics of a column. The Identity and Access Management (IAM) permission required for this operation is DeletePartition.
Arguments
column_name
: Name of the column.
database_name
: The name of the catalog database where the partitions reside.
partition_values
: A list of partition values identifying the partition.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.delete_column_statistics_for_table — Method
delete_column_statistics_for_table(column_name, database_name, table_name)
delete_column_statistics_for_table(column_name, database_name, table_name, params::Dict{String,<:Any})
Deletes table statistics of columns. The Identity and Access Management (IAM) permission required for this operation is DeleteTable.
Arguments
column_name
: The name of the column.
database_name
: The name of the catalog database where the partitions reside.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.delete_connection — Method
delete_connection(connection_name)
delete_connection(connection_name, params::Dict{String,<:Any})
Deletes a connection from the Data Catalog.
Arguments
connection_name
: The name of the connection to delete.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.delete_crawler — Method
delete_crawler(name)
delete_crawler(name, params::Dict{String,<:Any})
Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING.
Arguments
name
: The name of the crawler to remove.
Main.Glue.delete_custom_entity_type — Method
delete_custom_entity_type(name)
delete_custom_entity_type(name, params::Dict{String,<:Any})
Deletes a custom pattern by specifying its name.
Arguments
name
: The name of the custom pattern that you want to delete.
Main.Glue.delete_data_quality_ruleset — Method
delete_data_quality_ruleset(name)
delete_data_quality_ruleset(name, params::Dict{String,<:Any})
Deletes a data quality ruleset.
Arguments
name
: The name of the data quality ruleset to delete.
Main.Glue.delete_database — Method
delete_database(name)
delete_database(name, params::Dict{String,<:Any})
Removes a specified database from a Data Catalog. After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service. To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.
Arguments
name
: The name of the database to delete. For Hive compatibility, this must be all lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the database resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.delete_dev_endpoint
— Methoddelete_dev_endpoint(endpoint_name)
delete_dev_endpoint(endpoint_name, params::Dict{String,<:Any})
Deletes a specified development endpoint.
Arguments
endpoint_name
: The name of the DevEndpoint.
Main.Glue.delete_job
— Methoddelete_job(job_name)
delete_job(job_name, params::Dict{String,<:Any})
Deletes a specified job definition. If the job definition is not found, no exception is thrown.
Arguments
job_name
: The name of the job definition to delete.
Main.Glue.delete_mltransform
— Methoddelete_mltransform(transform_id)
delete_mltransform(transform_id, params::Dict{String,<:Any})
Deletes a Glue machine learning transform. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. If you no longer need a transform, you can delete it by calling DeleteMLTransform. However, any Glue jobs that still reference the deleted transform will no longer succeed.
Arguments
transform_id
: The unique identifier of the transform to delete.
Main.Glue.delete_partition
— Methoddelete_partition(database_name, partition_values, table_name)
delete_partition(database_name, partition_values, table_name, params::Dict{String,<:Any})
Deletes a specified partition.
Arguments
database_name
: The name of the catalog database in which the table in question resides.
partition_values
: The values that define the partition.
table_name
: The name of the table that contains the partition to be deleted.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.delete_partition_index
— Methoddelete_partition_index(database_name, index_name, table_name)
delete_partition_index(database_name, index_name, table_name, params::Dict{String,<:Any})
Deletes a specified partition index from an existing table.
Arguments
database_name
: Specifies the name of a database from which you want to delete a partition index.
index_name
: The name of the partition index to be deleted.
table_name
: Specifies the name of a table from which you want to delete a partition index.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The catalog ID where the table resides.
Main.Glue.delete_registry
— Methoddelete_registry(registry_id)
delete_registry(registry_id, params::Dict{String,<:Any})
Delete the entire registry including schema and all of its versions. To get the status of the delete operation, you can call the GetRegistry API after the asynchronous call. Deleting a registry will deactivate all online operations for the registry such as the UpdateRegistry, CreateSchema, UpdateSchema, and RegisterSchemaVersion APIs.
Arguments
registry_id
: This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Main.Glue.delete_resource_policy
— Methoddelete_resource_policy()
delete_resource_policy(params::Dict{String,<:Any})
Deletes a specified policy.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"PolicyHashCondition"
: The hash value returned when this policy was set.
"ResourceArn"
: The ARN of the Glue resource for the resource policy to be deleted.
Main.Glue.delete_schema
— Methoddelete_schema(schema_id)
delete_schema(schema_id, params::Dict{String,<:Any})
Deletes the entire schema set, including all of its versions. To get the status of the delete operation, you can call the GetSchema API after the asynchronous call. Deleting a schema will deactivate all online operations for the schema, such as the GetSchemaByDefinition and RegisterSchemaVersion APIs.
Arguments
schema_id
: This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
Main.Glue.delete_schema_versions
— Methoddelete_schema_versions(schema_id, versions)
delete_schema_versions(schema_id, versions, params::Dict{String,<:Any})
Removes versions from the specified schema. A version number or range may be supplied. If the compatibility mode forbids deleting a version that is necessary, such as BACKWARDS_FULL, an error is returned. Calling the GetSchemaVersions API after this call will list the status of the deleted versions. When the range of version numbers contains a checkpointed version, the API returns a 409 conflict and does not proceed with the deletion. You must first remove the checkpoint using the DeleteSchemaCheckpoint API before using this API. You cannot use the DeleteSchemaVersions API to delete the first schema version in the schema set. The first schema version can only be deleted by the DeleteSchema API. This operation also deletes the attached SchemaVersionMetadata under the schema versions. Hard deletes will be enforced on the database.
Arguments
schema_id
: This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
versions
: The versions to delete, given either as a single version number (for example, 5) or as a range (for example, 5-8, which deletes versions 5, 6, 7, and 8).
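As a rough illustration of the two accepted forms, the sketch below expands a version specification into the version numbers it covers, which can help preview what a call would delete. The helper name expand_versions is hypothetical and not part of the Glue module; it only mirrors the format described above.

```julia
# Hypothetical helper (not part of the Glue module): expand a version
# specification such as "5" or "5-8" into the list of versions it covers.
function expand_versions(spec::AbstractString)
    parts = split(strip(spec), '-')
    if length(parts) == 1
        # Single version number, e.g. "5".
        return [parse(Int, parts[1])]
    elseif length(parts) == 2
        # Inclusive range, e.g. "5-8".
        first_version = parse(Int, parts[1])
        last_version = parse(Int, parts[2])
        first_version <= last_version ||
            throw(ArgumentError("range start exceeds range end in \"$spec\""))
        return collect(first_version:last_version)
    else
        throw(ArgumentError("expected \"N\" or \"N-M\", got \"$spec\""))
    end
end

println(expand_versions("5-8"))
```

The specification string itself, not the expanded list, is what would be passed as the versions argument.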
Main.Glue.delete_security_configuration
— Methoddelete_security_configuration(name)
delete_security_configuration(name, params::Dict{String,<:Any})
Deletes a specified security configuration.
Arguments
name
: The name of the security configuration to delete.
Main.Glue.delete_session
— Methoddelete_session(id)
delete_session(id, params::Dict{String,<:Any})
Deletes the session.
Arguments
id
: The ID of the session to be deleted.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"RequestOrigin"
: The name of the origin of the delete session request.
Main.Glue.delete_table
— Methoddelete_table(database_name, name)
delete_table(database_name, name, params::Dict{String,<:Any})
Removes a table definition from the Data Catalog. After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service. To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
Arguments
database_name
: The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.
name
: The name of the table to be deleted. For Hive compatibility, this name is entirely lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.
"TransactionId"
: The transaction ID at which to delete the table contents.
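The cleanup order described for delete_table can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function purge_table is hypothetical, the caller is assumed to already know the partition values and version IDs to remove, and error handling is omitted. The glue argument is any object that exposes the wrapper functions as properties, such as the module created with @service Glue or a NamedTuple of stand-in functions.

```julia
# Hypothetical helper: delete a table's partitions and versions explicitly
# before deleting the table itself, following the order described above.
# `glue` is assumed to expose delete_partition, delete_table_version, and
# delete_table with the positional signatures documented on this page.
function purge_table(glue, database_name, table_name, partition_values_list, version_ids)
    for partition_values in partition_values_list
        glue.delete_partition(database_name, partition_values, table_name)
    end
    for version_id in version_ids
        glue.delete_table_version(database_name, table_name, version_id)
    end
    glue.delete_table(database_name, table_name)
    return nothing
end

# Dry run with stand-in functions that only record the calls they receive.
calls = String[]
recorder = (
    delete_partition = (db, values, table) -> push!(calls, "partition:" * join(values, "/")),
    delete_table_version = (db, table, version) -> push!(calls, "version:" * version),
    delete_table = (db, name) -> push!(calls, "table:" * name),
)
purge_table(recorder, "sales_db", "orders", [["2024", "01"]], ["1", "2"])
println(calls)
```

Passing the client as an argument keeps the ordering logic testable without credentials; with the real module the call would be purge_table(Glue, ...).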
Main.Glue.delete_table_optimizer
— Methoddelete_table_optimizer(catalog_id, database_name, table_name, type)
delete_table_optimizer(catalog_id, database_name, table_name, type, params::Dict{String,<:Any})
Deletes an optimizer and all associated metadata for a table. The optimization will no longer be performed on the table.
Arguments
catalog_id
: The Catalog ID of the table.
database_name
: The name of the database in the catalog in which the table resides.
table_name
: The name of the table.
type
: The type of table optimizer.
Main.Glue.delete_table_version
— Methoddelete_table_version(database_name, table_name, version_id)
delete_table_version(database_name, table_name, version_id, params::Dict{String,<:Any})
Deletes a specified version of a table.
Arguments
database_name
: The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
table_name
: The name of the table. For Hive compatibility, this name is entirely lowercase.
version_id
: The ID of the table version to be deleted. A VersionID is a string representation of an integer. Each version is incremented by 1.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.delete_trigger
— Methoddelete_trigger(name)
delete_trigger(name, params::Dict{String,<:Any})
Deletes a specified trigger. If the trigger is not found, no exception is thrown.
Arguments
name
: The name of the trigger to delete.
Main.Glue.delete_usage_profile
— Methoddelete_usage_profile(name)
delete_usage_profile(name, params::Dict{String,<:Any})
Deletes the specified Glue usage profile.
Arguments
name
: The name of the usage profile to delete.
Main.Glue.delete_user_defined_function
— Methoddelete_user_defined_function(database_name, function_name)
delete_user_defined_function(database_name, function_name, params::Dict{String,<:Any})
Deletes an existing function definition from the Data Catalog.
Arguments
database_name
: The name of the catalog database where the function is located.
function_name
: The name of the function definition to be deleted.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the function to be deleted is located. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.delete_workflow
— Methoddelete_workflow(name)
delete_workflow(name, params::Dict{String,<:Any})
Deletes a workflow.
Arguments
name
: Name of the workflow to be deleted.
Main.Glue.get_blueprint
— Methodget_blueprint(name)
get_blueprint(name, params::Dict{String,<:Any})
Retrieves the details of a blueprint.
Arguments
name
: The name of the blueprint.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"IncludeBlueprint"
: Specifies whether or not to include the blueprint in the response.
"IncludeParameterSpec"
: Specifies whether or not to include the parameter specification.
Main.Glue.get_blueprint_run
— Methodget_blueprint_run(blueprint_name, run_id)
get_blueprint_run(blueprint_name, run_id, params::Dict{String,<:Any})
Retrieves the details of a blueprint run.
Arguments
blueprint_name
: The name of the blueprint.
run_id
: The run ID for the blueprint run you want to retrieve.
Main.Glue.get_blueprint_runs
— Methodget_blueprint_runs(blueprint_name)
get_blueprint_runs(blueprint_name, params::Dict{String,<:Any})
Retrieves the details of blueprint runs for a specified blueprint.
Arguments
blueprint_name
: The name of the blueprint.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
Main.Glue.get_catalog_import_status
— Methodget_catalog_import_status()
get_catalog_import_status(params::Dict{String,<:Any})
Retrieves the status of a migration operation.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the catalog to migrate. Currently, this should be the Amazon Web Services account ID.
Main.Glue.get_classifier
— Methodget_classifier(name)
get_classifier(name, params::Dict{String,<:Any})
Retrieve a classifier by name.
Arguments
name
: Name of the classifier to retrieve.
Main.Glue.get_classifiers
— Methodget_classifiers()
get_classifiers(params::Dict{String,<:Any})
Lists all classifier objects in the Data Catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The size of the list to return (optional).
"NextToken"
: An optional continuation token.
Main.Glue.get_column_statistics_for_partition
— Methodget_column_statistics_for_partition(column_names, database_name, partition_values, table_name)
get_column_statistics_for_partition(column_names, database_name, partition_values, table_name, params::Dict{String,<:Any})
Retrieves partition statistics of columns. The Identity and Access Management (IAM) permission required for this operation is GetPartition.
Arguments
column_names
: A list of the column names.
database_name
: The name of the catalog database where the partitions reside.
partition_values
: A list of partition values identifying the partition.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.get_column_statistics_for_table
— Methodget_column_statistics_for_table(column_names, database_name, table_name)
get_column_statistics_for_table(column_names, database_name, table_name, params::Dict{String,<:Any})
Retrieves table statistics of columns. The Identity and Access Management (IAM) permission required for this operation is GetTable.
Arguments
column_names
: A list of the column names.
database_name
: The name of the catalog database where the partitions reside.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.get_column_statistics_task_run
— Methodget_column_statistics_task_run(column_statistics_task_run_id)
get_column_statistics_task_run(column_statistics_task_run_id, params::Dict{String,<:Any})
Get the associated metadata/information for a task run, given a task run ID.
Arguments
column_statistics_task_run_id
: The identifier for the particular column statistics task run.
Main.Glue.get_column_statistics_task_runs
— Methodget_column_statistics_task_runs(database_name, table_name)
get_column_statistics_task_runs(database_name, table_name, params::Dict{String,<:Any})
Retrieves information about all runs associated with the specified table.
Arguments
database_name
: The name of the database where the table resides.
table_name
: The name of the table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The maximum size of the response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_connection
— Methodget_connection(name)
get_connection(name, params::Dict{String,<:Any})
Retrieves a connection definition from the Data Catalog.
Arguments
name
: The name of the connection definition to retrieve.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.
"HidePassword"
: Allows you to retrieve the connection metadata without returning the password. For instance, the Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.
Main.Glue.get_connections
— Methodget_connections()
get_connections(params::Dict{String,<:Any})
Retrieves a list of connection definitions from the Data Catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the connections reside. If none is provided, the Amazon Web Services account ID is used by default.
"Filter"
: A filter that controls which connections are returned.
"HidePassword"
: Allows you to retrieve the connection metadata without returning the password. For instance, the Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.
"MaxResults"
: The maximum number of connections to return in one response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_crawler
— Methodget_crawler(name)
get_crawler(name, params::Dict{String,<:Any})
Retrieves metadata for a specified crawler.
Arguments
name
: The name of the crawler to retrieve metadata for.
Main.Glue.get_crawler_metrics
— Methodget_crawler_metrics()
get_crawler_metrics(params::Dict{String,<:Any})
Retrieves metrics about specified crawlers.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CrawlerNameList"
: A list of the names of crawlers about which to retrieve metrics.
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_crawlers
— Methodget_crawlers()
get_crawlers(params::Dict{String,<:Any})
Retrieves metadata for all crawlers defined in the customer account.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The number of crawlers to return on each call.
"NextToken"
: A continuation token, if this is a continuation request.
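Many of the list-style calls on this page share the MaxResults and NextToken keys. A minimal sketch of following continuation tokens until the listing is exhausted is shown below. The helper collect_pages is hypothetical, and it assumes that the response behaves like a dictionary holding the items under a known key plus an optional "NextToken" entry; the key name "Crawlers" used here is an assumption about the response shape, not something stated on this page.

```julia
# Hypothetical helper: call `fetch(params)` repeatedly, passing back the
# "NextToken" from each response, and gather the items found under
# `result_key`. `fetch` is any function that accepts a params Dict.
function collect_pages(fetch, result_key::AbstractString; page_size::Integer=100)
    items = Any[]
    params = Dict{String,Any}("MaxResults" => page_size)
    while true
        response = fetch(params)
        append!(items, get(response, result_key, Any[]))
        token = get(response, "NextToken", nothing)
        # Stop when the service returns no continuation token.
        (token === nothing || isempty(token)) && break
        params["NextToken"] = token
    end
    return items
end

# Stand-in for a real call, serving two canned pages keyed by token.
canned_pages = Dict(
    "" => Dict{String,Any}("Crawlers" => ["a", "b"], "NextToken" => "t1"),
    "t1" => Dict{String,Any}("Crawlers" => ["c"]),
)
fake_get_crawlers(params) = canned_pages[get(params, "NextToken", "")]

all_crawlers = collect_pages(fake_get_crawlers, "Crawlers")
println(all_crawlers)
```

With the real module, the same helper could be called as collect_pages(p -> Glue.get_crawlers(p), "Crawlers"), subject to the response-shape assumption above.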
Main.Glue.get_custom_entity_type
— Methodget_custom_entity_type(name)
get_custom_entity_type(name, params::Dict{String,<:Any})
Retrieves the details of a custom pattern by specifying its name.
Arguments
name
: The name of the custom pattern that you want to retrieve.
Main.Glue.get_data_catalog_encryption_settings
— Methodget_data_catalog_encryption_settings()
get_data_catalog_encryption_settings(params::Dict{String,<:Any})
Retrieves the security configuration for a specified catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog to retrieve the security configuration for. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.get_data_quality_result
— Methodget_data_quality_result(result_id)
get_data_quality_result(result_id, params::Dict{String,<:Any})
Retrieves the result of a data quality rule evaluation.
Arguments
result_id
: A unique result ID for the data quality result.
Main.Glue.get_data_quality_rule_recommendation_run
— Methodget_data_quality_rule_recommendation_run(run_id)
get_data_quality_rule_recommendation_run(run_id, params::Dict{String,<:Any})
Gets the specified recommendation run that was used to generate rules.
Arguments
run_id
: The unique run identifier associated with this run.
Main.Glue.get_data_quality_ruleset
— Methodget_data_quality_ruleset(name)
get_data_quality_ruleset(name, params::Dict{String,<:Any})
Returns an existing ruleset by identifier or name.
Arguments
name
: The name of the ruleset.
Main.Glue.get_data_quality_ruleset_evaluation_run
— Methodget_data_quality_ruleset_evaluation_run(run_id)
get_data_quality_ruleset_evaluation_run(run_id, params::Dict{String,<:Any})
Retrieves a specific run where a ruleset is evaluated against a data source.
Arguments
run_id
: The unique run identifier associated with this run.
Main.Glue.get_database
— Methodget_database(name)
get_database(name, params::Dict{String,<:Any})
Retrieves the definition of a specified database.
Arguments
name
: The name of the database to retrieve. For Hive compatibility, this should be all lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the database resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.get_databases
— Methodget_databases()
get_databases(params::Dict{String,<:Any})
Retrieves all databases defined in a given Data Catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog from which to retrieve Databases. If none is provided, the Amazon Web Services account ID is used by default.
"MaxResults"
: The maximum number of databases to return in one response.
"NextToken"
: A continuation token, if this is a continuation call.
"ResourceShareType"
: Allows you to specify that you want to list the databases shared with your account. The allowable values are FEDERATED, FOREIGN, or ALL. If set to FEDERATED, lists the federated databases (referencing an external entity) shared with your account. If set to FOREIGN, lists the databases shared with your account. If set to ALL, lists the databases shared with your account, as well as the databases in your local account.
Main.Glue.get_dataflow_graph
— Methodget_dataflow_graph()
get_dataflow_graph(params::Dict{String,<:Any})
Transforms a Python script into a directed acyclic graph (DAG).
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"PythonScript"
: The Python script to transform.
Main.Glue.get_dev_endpoint
— Methodget_dev_endpoint(endpoint_name)
get_dev_endpoint(endpoint_name, params::Dict{String,<:Any})
Retrieves information about a specified development endpoint. When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.
Arguments
endpoint_name
: Name of the DevEndpoint to retrieve information for.
Main.Glue.get_dev_endpoints
— Methodget_dev_endpoints()
get_dev_endpoints(params::Dict{String,<:Any})
Retrieves all the development endpoints in this Amazon Web Services account. When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The maximum size of information to return.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_job
— Methodget_job(job_name)
get_job(job_name, params::Dict{String,<:Any})
Retrieves an existing job definition.
Arguments
job_name
: The name of the job definition to retrieve.
Main.Glue.get_job_bookmark
— Methodget_job_bookmark(job_name)
get_job_bookmark(job_name, params::Dict{String,<:Any})
Returns information on a job bookmark entry. For more information about enabling and using job bookmarks, see Tracking processed data using job bookmarks, Job parameters used by Glue, and Job structure.
Arguments
job_name
: The name of the job in question.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"RunId"
: The unique run identifier associated with this job run.
Main.Glue.get_job_run
— Methodget_job_run(job_name, run_id)
get_job_run(job_name, run_id, params::Dict{String,<:Any})
Retrieves the metadata for a given job run. Job run history is accessible for 90 days for your workflow and job run.
Arguments
job_name
: Name of the job definition being run.
run_id
: The ID of the job run.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"PredecessorsIncluded"
: True if a list of predecessor runs should be returned.
Main.Glue.get_job_runs
— Methodget_job_runs(job_name)
get_job_runs(job_name, params::Dict{String,<:Any})
Retrieves metadata for all runs of a given job definition.
Arguments
job_name
: The name of the job definition for which to retrieve all job runs.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The maximum size of the response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_jobs
— Methodget_jobs()
get_jobs(params::Dict{String,<:Any})
Retrieves all current job definitions.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"MaxResults"
: The maximum size of the response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_mapping
— Methodget_mapping(source)
get_mapping(source, params::Dict{String,<:Any})
Creates mappings.
Arguments
source
: Specifies the source table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"Location"
: Parameters for the mapping.
"Sinks"
: A list of target tables.
Main.Glue.get_mltask_run
— Methodget_mltask_run(task_run_id, transform_id)
get_mltask_run(task_run_id, transform_id, params::Dict{String,<:Any})
Gets details for a specific task run on a machine learning transform. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can check the stats of any task run by calling GetMLTaskRun with the TaskRunID and its parent transform's TransformID.
Arguments
task_run_id
: The unique identifier of the task run.
transform_id
: The unique identifier of the machine learning transform.
Main.Glue.get_mltask_runs
— Methodget_mltask_runs(transform_id)
get_mltask_runs(transform_id, params::Dict{String,<:Any})
Gets a list of runs for a machine learning transform. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can get a sortable, filterable list of machine learning task runs by calling GetMLTaskRuns with their parent transform's TransformID and other optional parameters as documented in this section. This operation returns a list of historic runs and must be paginated.
Arguments
transform_id
: The unique identifier of the machine learning transform.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"Filter"
: The filter criteria, in the TaskRunFilterCriteria structure, for the task run.
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A token for pagination of the results. The default is empty.
"Sort"
: The sorting criteria, in the TaskRunSortCriteria structure, for the task run.
Main.Glue.get_mltransform
— Methodget_mltransform(transform_id)
get_mltransform(transform_id, params::Dict{String,<:Any})
Gets a Glue machine learning transform artifact and all its corresponding metadata. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. You can retrieve their metadata by calling GetMLTransform.
Arguments
transform_id
: The unique identifier of the transform, generated at the time that the transform was created.
Main.Glue.get_mltransforms
— Methodget_mltransforms()
get_mltransforms(params::Dict{String,<:Any})
Gets a sortable, filterable list of existing Glue machine learning transforms. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue, and you can retrieve their metadata by calling GetMLTransforms.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"Filter"
: The filter transformation criteria.
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A paginated token to offset the results.
"Sort"
: The sorting criteria.
Main.Glue.get_partition
— Methodget_partition(database_name, partition_values, table_name)
get_partition(database_name, partition_values, table_name, params::Dict{String,<:Any})
Retrieves information about a specified partition.
Arguments
database_name
: The name of the catalog database where the partition resides.
partition_values
: The values that define the partition.
table_name
: The name of the partition's table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partition in question resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.get_partition_indexes
— Methodget_partition_indexes(database_name, table_name)
get_partition_indexes(database_name, table_name, params::Dict{String,<:Any})
Retrieves the partition indexes associated with a table.
Arguments
database_name
: Specifies the name of a database from which you want to retrieve partition indexes.
table_name
: Specifies the name of a table for which you want to retrieve the partition indexes.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The catalog ID where the table resides.
"NextToken"
: A continuation token, included if this is a continuation call.
Main.Glue.get_partitions
— Methodget_partitions(database_name, table_name)
get_partitions(database_name, table_name, params::Dict{String,<:Any})
Retrieves information about the partitions in a table.
Arguments
database_name
: The name of the catalog database where the partitions reside.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}
. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is provided, the Amazon Web Services account ID is used by default.
"ExcludeColumnSchema"
: When true, the partition column schema is not returned. Useful when you are interested only in other partition attributes such as partition values or location. This approach avoids the problem of a large response by not returning duplicate data.
"Expression"
: An expression that filters the partitions to be returned. The expression uses SQL syntax similar to the SQL WHERE filter clause. The SQL statement parser JSQLParser parses the expression.
Operators: The following are the operators that you can use in the Expression API call:
= Checks whether the values of the two operands are equal; if yes, then the condition becomes true. Example: Assume 'variable a' holds 10 and 'variable b' holds 20. (a = b) is not true.
< > Checks whether the values of two operands are equal; if the values are not equal, then the condition becomes true. Example: (a < > b) is true.
> Checks whether the value of the left operand is greater than the value of the right operand; if yes, then the condition becomes true. Example: (a > b) is not true.
< Checks whether the value of the left operand is less than the value of the right operand; if yes, then the condition becomes true. Example: (a < b) is true.
>= Checks whether the value of the left operand is greater than or equal to the value of the right operand; if yes, then the condition becomes true. Example: (a >= b) is not true.
<= Checks whether the value of the left operand is less than or equal to the value of the right operand; if yes, then the condition becomes true. Example: (a <= b) is true.
AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL Logical operators.
Supported Partition Key Types: The following are the supported partition keys: string, date, timestamp, int, bigint, long, tinyint, smallint, decimal. If a type is encountered that is not valid, an exception is thrown.
The following list shows the valid operators on each type. When you define a crawler, the partitionKey type is created as a STRING, to be compatible with the catalog partitions.
Sample API Call:
"MaxResults"
: The maximum number of partitions to return in a single response.
"NextToken"
: A continuation token, if this is not the first call to retrieve these partitions.
"QueryAsOfTime"
: The time as of when to read the partition contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId.
"Segment"
: The segment of the table's partitions to scan in this request.
"TransactionId"
: The transaction ID at which to read the partition contents.
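The optional keys above might be assembled in Julia roughly as follows. This is a minimal sketch under stated assumptions: the helper `build_expression` is hypothetical (not part of AWS.jl), the database, table, and partition key names are placeholders, and the `"ExcludeColumnSchema"` key name and the `get_partitions(database_name, table_name, params)` call shape are assumed from the Glue API rather than shown in this section. The final call is commented out because it needs AWS.jl and valid credentials.

```julia
# Hypothetical helper (not part of AWS.jl): wraps each SQL-like condition
# in parentheses and joins them with AND to form one Expression string.
build_expression(conditions) = join(("(" * c * ")" for c in conditions), " AND ")

# Placeholder partition keys `year` and `month`, both assumed to be strings.
expr = build_expression(["year = '2015'", "month BETWEEN '01' AND '06'"])

params = Dict{String,Any}(
    "Expression" => expr,
    "MaxResults" => 100,             # page size for a single response
    "ExcludeColumnSchema" => true,   # assumed key name; skips the column schema
)

println(params["Expression"])

# With AWS.jl loaded and credentials configured, the call would look like:
# Glue.get_partitions("my_database", "my_table", params)
```

Parenthesizing each condition keeps operator precedence explicit when OR conditions are mixed in later.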
Main.Glue.get_plan — Method
get_plan(mapping, source)
get_plan(mapping, source, params::Dict{String,<:Any})
Gets code to perform a specified mapping.
Arguments
mapping
: The list of mappings from a source table to target tables.
source
: The source table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AdditionalPlanOptionsMap"
: A map to hold additional optional key-value parameters. Currently, these key-value pairs are supported: inferSchema: Specifies whether to set inferSchema to true or false for the default script generated by a Glue job. For example, to set inferSchema to true, pass the following key-value pair: --additional-plan-options-map '{"inferSchema":"true"}'
"Language"
: The programming language of the code to perform the mapping.
"Location"
: The parameters for the mapping.
"Sinks"
: The target tables.
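A hedged sketch of what the arguments to get_plan could look like in Julia. The field names inside the mapping entry and the source table, and the `"PYTHON"` language value, are assumptions based on the Glue API's MappingEntry and CatalogEntry structures; they are not stated on this page. All table and column names are placeholders.

```julia
# One mapping entry from a source column to a target column.
# Field names are assumed from the Glue MappingEntry structure.
mapping = [
    Dict{String,Any}(
        "SourceTable" => "raw_orders", "SourcePath" => "order_id", "SourceType" => "string",
        "TargetTable" => "clean_orders", "TargetPath" => "id", "TargetType" => "bigint",
    ),
]

# Source table as a catalog entry (assumed field names).
source = Dict{String,Any}("DatabaseName" => "my_database", "TableName" => "raw_orders")

params = Dict{String,Any}(
    "Language" => "PYTHON",  # assumed accepted value
    "AdditionalPlanOptionsMap" => Dict("inferSchema" => "true"),
)

# With AWS.jl loaded and credentials configured:
# Glue.get_plan(mapping, source, params)
```

Note that the inferSchema value is the string "true", matching the key-value example given for AdditionalPlanOptionsMap.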
Main.Glue.get_registry — Method
get_registry(registry_id)
get_registry(registry_id, params::Dict{String,<:Any})
Describes the specified registry in detail.
Arguments
registry_id
: This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Main.Glue.get_resource_policies — Method
get_resource_policies()
get_resource_policies(params::Dict{String,<:Any})
Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants. Also retrieves the Data Catalog resource policy. If you enabled metadata encryption in Data Catalog settings, and you do not have permission on the KMS key, the operation can't return the Data Catalog resource policy.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
Main.Glue.get_resource_policy — Method
get_resource_policy()
get_resource_policy(params::Dict{String,<:Any})
Retrieves a specified resource policy.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"ResourceArn"
: The ARN of the Glue resource for which to retrieve the resource policy. If not supplied, the Data Catalog resource policy is returned. Use GetResourcePolicies to view all existing resource policies. For more information, see Specifying Glue Resource ARNs.
Main.Glue.get_schema — Method
get_schema(schema_id)
get_schema(schema_id, params::Dict{String,<:Any})
Describes the specified schema in detail.
Arguments
schema_id
: This is a wrapper structure to contain schema identity fields. The structure contains:
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn, or SchemaName and RegistryName, must be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn, or SchemaName and RegistryName, must be provided.
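The "either SchemaArn, or SchemaName and RegistryName" rule can be captured in a small constructor. The sketch below is illustrative only: `make_schema_id` is a hypothetical helper (not part of AWS.jl), the ARN and names are placeholders, and preferring the ARN when it is given is a choice made for this example, not documented behavior.

```julia
# Hypothetical helper (not part of AWS.jl): builds the schema_id wrapper
# from either an ARN, or a schema name together with a registry name.
function make_schema_id(; arn=nothing, name=nothing, registry=nothing)
    if arn !== nothing
        # Design choice of this sketch: the ARN alone identifies the schema.
        return Dict{String,Any}("SchemaArn" => arn)
    elseif name !== nothing && registry !== nothing
        return Dict{String,Any}("SchemaName" => name, "RegistryName" => registry)
    else
        throw(ArgumentError("provide arn, or both name and registry"))
    end
end

by_name = make_schema_id(name="orders", registry="my-registry")
by_arn = make_schema_id(arn="arn:aws:glue:us-east-1:123456789012:schema/my-registry/orders")

# With AWS.jl loaded and credentials configured:
# Glue.get_schema(by_name)
```

Failing early with an ArgumentError avoids a round trip to the service for a request that cannot identify a schema.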
Main.Glue.get_schema_by_definition — Method
get_schema_by_definition(schema_definition, schema_id)
get_schema_by_definition(schema_definition, schema_id, params::Dict{String,<:Any})
Retrieves a schema by the SchemaDefinition. The schema definition is sent to the Schema Registry, canonicalized, and hashed. If the hash is matched within the scope of the SchemaName or ARN (or the default registry, if none is supplied), that schema's metadata is returned. Otherwise, a 404 or NotFound error is returned. Schema versions in Deleted status will not be included in the results.
Arguments
schema_definition
: The definition of the schema for which schema details are required.
schema_id
: This is a wrapper structure to contain schema identity fields. The structure contains:
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName must be provided.
SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName must be provided.
Main.Glue.get_schema_version — Method
get_schema_version()
get_schema_version(params::Dict{String,<:Any})
Gets the specified schema by its unique ID assigned when a version of the schema is created or registered. Schema versions in Deleted status will not be included in the results.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"SchemaId"
: This is a wrapper structure to contain schema identity fields. The structure contains:
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn, or SchemaName and RegistryName, must be provided.
SchemaId$SchemaName: The name of the schema. Either SchemaArn, or SchemaName and RegistryName, must be provided.
"SchemaVersionId"
: The SchemaVersionId of the schema version. This field is required for fetching by schema ID. Either this or the SchemaId wrapper must be provided.
"SchemaVersionNumber"
: The version number of the schema.
Main.Glue.get_schema_versions_diff — Method
get_schema_versions_diff(first_schema_version_number, schema_diff_type, schema_id, second_schema_version_number)
get_schema_versions_diff(first_schema_version_number, schema_diff_type, schema_id, second_schema_version_number, params::Dict{String,<:Any})
Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry. This API allows you to compare two schema versions between two schema definitions under the same schema.
Arguments
first_schema_version_number
: The first of the two schema versions to be compared.
schema_diff_type
: Refers to SYNTAX_DIFF, which is the currently supported diff type.
schema_id
: This is a wrapper structure to contain schema identity fields. The structure contains:
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName must be provided.
SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName must be provided.
second_schema_version_number
: The second of the two schema versions to be compared.
Main.Glue.get_security_configuration — Method
get_security_configuration(name)
get_security_configuration(name, params::Dict{String,<:Any})
Retrieves a specified security configuration.
Arguments
name
: The name of the security configuration to retrieve.
Main.Glue.get_security_configurations — Method
get_security_configurations()
get_security_configurations(params::Dict{String,<:Any})
Retrieves a list of all security configurations.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_session — Method
get_session(id)
get_session(id, params::Dict{String,<:Any})
Retrieves the session.
Arguments
id
: The ID of the session.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RequestOrigin"
: The origin of the request.
Main.Glue.get_statement — Method
get_statement(id, session_id)
get_statement(id, session_id, params::Dict{String,<:Any})
Retrieves the statement.
Arguments
id
: The ID of the statement.
session_id
: The session ID of the statement.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RequestOrigin"
: The origin of the request.
Main.Glue.get_table — Method
get_table(database_name, name)
get_table(database_name, name, params::Dict{String,<:Any})
Retrieves the Table definition in a Data Catalog for a specified table.
Arguments
database_name
: The name of the database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
name
: The name of the table for which to retrieve the definition. For Hive compatibility, this name is entirely lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.
"QueryAsOfTime"
: The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId.
"TransactionId"
: The transaction ID at which to read the table contents.
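Because QueryAsOfTime cannot be specified along with TransactionId, it can help to build the params dictionary through a helper that rejects the combination before any request is made. The following is a sketch, not part of AWS.jl: `read_point_params` is a hypothetical name, and the value format expected for QueryAsOfTime is not specified on this page, so the helper passes whatever it is given through unchanged.

```julia
# Hypothetical helper (not part of AWS.jl): builds the optional params for a
# read at a point in time, enforcing that only one selector is supplied.
function read_point_params(; query_as_of_time=nothing, transaction_id=nothing)
    if query_as_of_time !== nothing && transaction_id !== nothing
        throw(ArgumentError("QueryAsOfTime cannot be specified along with TransactionId"))
    end
    params = Dict{String,Any}()
    if query_as_of_time !== nothing
        params["QueryAsOfTime"] = query_as_of_time
    end
    if transaction_id !== nothing
        params["TransactionId"] = transaction_id
    end
    return params
end

params = read_point_params(transaction_id="tx-0001")  # placeholder transaction ID

# With AWS.jl loaded and credentials configured:
# Glue.get_table("my_database", "my_table", params)
```

Calling the helper with no arguments returns an empty dictionary, which corresponds to reading at the most recent transaction commit time.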
Main.Glue.get_table_optimizer — Method
get_table_optimizer(catalog_id, database_name, table_name, type)
get_table_optimizer(catalog_id, database_name, table_name, type, params::Dict{String,<:Any})
Returns the configuration of all optimizers associated with a specified table.
Arguments
catalog_id
: The Catalog ID of the table.
database_name
: The name of the database in the catalog in which the table resides.
table_name
: The name of the table.
type
: The type of table optimizer.
Main.Glue.get_table_version — Method
get_table_version(database_name, table_name)
get_table_version(database_name, table_name, params::Dict{String,<:Any})
Retrieves a specified version of a table.
Arguments
database_name
: The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
table_name
: The name of the table. For Hive compatibility, this name is entirely lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.
"VersionId"
: The ID value of the table version to be retrieved. A VersionId is a string representation of an integer. Each version is incremented by 1.
Main.Glue.get_table_versions — Method
get_table_versions(database_name, table_name)
get_table_versions(database_name, table_name, params::Dict{String,<:Any})
Retrieves a list of strings that identify available versions of a specified table.
Arguments
database_name
: The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
table_name
: The name of the table. For Hive compatibility, this name is entirely lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.
"MaxResults"
: The maximum number of table versions to return in one response.
"NextToken"
: A continuation token, if this is not the first call.
Main.Glue.get_tables — Method
get_tables(database_name)
get_tables(database_name, params::Dict{String,<:Any})
Retrieves the definitions of some or all of the tables in a given Database.
Arguments
database_name
: The database in the catalog whose tables to list. For Hive compatibility, this name is entirely lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.
"Expression"
: A regular expression pattern. If present, only those tables whose names match the pattern are returned.
"MaxResults"
: The maximum number of tables to return in a single response.
"NextToken"
: A continuation token, included if this is a continuation call.
"QueryAsOfTime"
: The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId.
"TransactionId"
: The transaction ID at which to read the table contents.
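Many operations on this page page their results through "MaxResults" and "NextToken". A generic loop for following continuation tokens could look like the sketch below. `collect_pages` is a hypothetical helper (not part of AWS.jl), and the response key names `"TableList"` and `"NextToken"` are assumed from the Glue GetTables response shape. A stand-in fetch function replaces the real service call so the example runs without credentials.

```julia
# Hypothetical helper (not part of AWS.jl): repeatedly calls `fetch_page`
# with the current params and gathers the items found under `list_key`
# until the response carries no continuation token.
function collect_pages(fetch_page, list_key::String; params=Dict{String,Any}())
    items = Any[]
    page_params = copy(params)   # leave the caller's dictionary untouched
    while true
        resp = fetch_page(page_params)
        append!(items, get(resp, list_key, Any[]))
        token = get(resp, "NextToken", nothing)
        (token === nothing || token == "") && break
        page_params["NextToken"] = token
    end
    return items
end

# Stand-in for a real call: returns two pages of placeholder table names.
function fake_fetch(p)
    if !haskey(p, "NextToken")
        return Dict{String,Any}("TableList" => ["orders", "customers"], "NextToken" => "page-2")
    else
        return Dict{String,Any}("TableList" => ["events"])
    end
end

tables = collect_pages(fake_fetch, "TableList")
println(tables)

# With AWS.jl loaded and credentials configured, the fetch function could be:
# collect_pages(p -> Glue.get_tables("my_database", p), "TableList")
```

Passing the fetch function as an argument lets the same loop serve other paged operations by changing only the closure and the list key.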
Main.Glue.get_tags — Method
get_tags(resource_arn)
get_tags(resource_arn, params::Dict{String,<:Any})
Retrieves a list of tags associated with a resource.
Arguments
resource_arn
: The Amazon Resource Name (ARN) of the resource for which to retrieve tags.
Main.Glue.get_trigger — Method
get_trigger(name)
get_trigger(name, params::Dict{String,<:Any})
Retrieves the definition of a trigger.
Arguments
name
: The name of the trigger to retrieve.
Main.Glue.get_triggers — Method
get_triggers()
get_triggers(params::Dict{String,<:Any})
Gets all the triggers associated with a job.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"DependentJobName"
: The name of the job to retrieve triggers for. The trigger that can start this job is returned, and if there is no such trigger, all triggers are returned.
"MaxResults"
: The maximum size of the response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_unfiltered_partition_metadata — Method
get_unfiltered_partition_metadata(catalog_id, database_name, partition_values, supported_permission_types, table_name)
get_unfiltered_partition_metadata(catalog_id, database_name, partition_values, supported_permission_types, table_name, params::Dict{String,<:Any})
Retrieves partition metadata from the Data Catalog that contains unfiltered metadata. For IAM authorization, the public IAM action associated with this API is glue:GetPartition.
Arguments
catalog_id
: The catalog ID where the partition resides.
database_name
: (Required) Specifies the name of a database that contains the partition.
partition_values
: (Required) A list of partition key values.
supported_permission_types
: (Required) A list of supported permission types.
table_name
: (Required) Specifies the name of a table that contains the partition.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AuditContext"
: A structure containing Lake Formation audit context information.
"QuerySessionContext"
: A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
"Region"
: Specified only if the base tables belong to a different Amazon Web Services Region.
Main.Glue.get_unfiltered_partitions_metadata — Method
get_unfiltered_partitions_metadata(catalog_id, database_name, supported_permission_types, table_name)
get_unfiltered_partitions_metadata(catalog_id, database_name, supported_permission_types, table_name, params::Dict{String,<:Any})
Retrieves partition metadata from the Data Catalog that contains unfiltered metadata. For IAM authorization, the public IAM action associated with this API is glue:GetPartitions.
Arguments
catalog_id
: The ID of the Data Catalog where the partitions in question reside. If none is provided, the Amazon Web Services account ID is used by default.
database_name
: The name of the catalog database where the partitions reside.
supported_permission_types
: A list of supported permission types.
table_name
: The name of the table that contains the partition.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AuditContext"
: A structure containing Lake Formation audit context information.
"Expression"
: An expression that filters the partitions to be returned. The expression uses SQL syntax similar to the SQL WHERE filter clause. The SQL statement parser JSQLParser parses the expression.
Operators: The following are the operators that you can use in the Expression API call:
= Checks whether the values of the two operands are equal; if yes, then the condition becomes true. Example: Assume 'variable a' holds 10 and 'variable b' holds 20. (a = b) is not true.
< > Checks whether the values of two operands are equal; if the values are not equal, then the condition becomes true. Example: (a < > b) is true.
> Checks whether the value of the left operand is greater than the value of the right operand; if yes, then the condition becomes true. Example: (a > b) is not true.
< Checks whether the value of the left operand is less than the value of the right operand; if yes, then the condition becomes true. Example: (a < b) is true.
>= Checks whether the value of the left operand is greater than or equal to the value of the right operand; if yes, then the condition becomes true. Example: (a >= b) is not true.
<= Checks whether the value of the left operand is less than or equal to the value of the right operand; if yes, then the condition becomes true. Example: (a <= b) is true.
AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL Logical operators.
Supported Partition Key Types: The following are the supported partition keys.
string, date, timestamp, int, bigint, long, tinyint, smallint, decimal
If a type is encountered that is not valid, an exception is thrown.
"MaxResults"
: The maximum number of partitions to return in a single response.
"NextToken"
: A continuation token, if this is not the first call to retrieve these partitions.
"QuerySessionContext"
: A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
"Region"
: Specified only if the base tables belong to a different Amazon Web Services Region.
"Segment"
: The segment of the table's partitions to scan in this request.
Main.Glue.get_unfiltered_table_metadata — Method
get_unfiltered_table_metadata(catalog_id, database_name, name, supported_permission_types)
get_unfiltered_table_metadata(catalog_id, database_name, name, supported_permission_types, params::Dict{String,<:Any})
Allows a third-party analytical engine to retrieve unfiltered table metadata from the Data Catalog. For IAM authorization, the public IAM action associated with this API is glue:GetTable.
Arguments
catalog_id
: The catalog ID where the table resides.
database_name
: (Required) Specifies the name of a database that contains the table.
name
: (Required) Specifies the name of a table for which you are requesting metadata.
supported_permission_types
: Indicates the level of filtering a third-party analytical engine is capable of enforcing when calling the GetUnfilteredTableMetadata API operation. Accepted values are:
COLUMN_PERMISSION - Column permissions ensure that users can access only specific columns in the table. If particular columns contain sensitive data, data lake administrators can define column filters that exclude access to specific columns.
CELL_FILTER_PERMISSION - Cell-level filtering combines column filtering (include or exclude columns) and row filter expressions to restrict access to individual elements in the table.
NESTED_PERMISSION - Nested permissions combine cell-level filtering and nested column filtering to restrict access to columns and/or nested columns in specific rows based on row filter expressions.
NESTED_CELL_PERMISSION - Nested cell permissions combine nested permission with nested cell-level filtering. This allows different subsets of nested columns to be restricted based on an array of row filter expressions.
Note: Each of these permission types follows a hierarchical order where each subsequent permission type includes all permissions of the previous type.
Important: If you provide a supported permission type that doesn't match the user's level of permissions on the table, then Lake Formation raises an exception. For example, if the third-party engine calling the GetUnfilteredTableMetadata operation can enforce only column-level filtering, and the user has nested cell filtering applied on the table, Lake Formation throws an exception, and will not return unfiltered table metadata and data access credentials.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AuditContext"
: A structure containing Lake Formation audit context information.
"ParentResourceArn"
: The resource ARN of the view.
"Permissions"
: The Lake Formation data permissions of the caller on the table. Used to authorize the call when no view context is found.
"QuerySessionContext"
: A structure used as a protocol between query engines and Lake Formation or Glue. Contains both a Lake Formation generated authorization identifier and information from the request's authorization context.
"Region"
: Specified only if the base tables belong to a different Amazon Web Services Region.
"RootResourceArn"
: The resource ARN of the root view in a chain of nested views.
"SupportedDialect"
: A structure specifying the dialect and dialect version used by the query engine.
Main.Glue.get_usage_profile — Method
get_usage_profile(name)
get_usage_profile(name, params::Dict{String,<:Any})
Retrieves information about the specified Glue usage profile.
Arguments
name
: The name of the usage profile to retrieve.
Main.Glue.get_user_defined_function — Method
get_user_defined_function(database_name, function_name)
get_user_defined_function(database_name, function_name, params::Dict{String,<:Any})
Retrieves a specified function definition from the Data Catalog.
Arguments
database_name
: The name of the catalog database where the function is located.
function_name
: The name of the function.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the function to be retrieved is located. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.get_user_defined_functions — Method
get_user_defined_functions(pattern)
get_user_defined_functions(pattern, params::Dict{String,<:Any})
Retrieves multiple function definitions from the Data Catalog.
Arguments
pattern
: An optional function-name pattern string that filters the function definitions returned.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the functions to be retrieved are located. If none is provided, the Amazon Web Services account ID is used by default.
"DatabaseName"
: The name of the catalog database where the functions are located. If none is provided, functions from all the databases across the catalog will be returned.
"MaxResults"
: The maximum number of functions to return in one response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.get_workflow — Method
get_workflow(name)
get_workflow(name, params::Dict{String,<:Any})
Retrieves resource metadata for a workflow.
Arguments
name
: The name of the workflow to retrieve.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"IncludeGraph"
: Specifies whether to include a graph when returning the workflow resource metadata.
Main.Glue.get_workflow_run — Method
get_workflow_run(name, run_id)
get_workflow_run(name, run_id, params::Dict{String,<:Any})
Retrieves the metadata for a given workflow run. Job run history is accessible for 90 days for your workflow and job run.
Arguments
name
: Name of the workflow being run.
run_id
: The ID of the workflow run.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"IncludeGraph"
: Specifies whether to include the workflow graph in the response.
Main.Glue.get_workflow_run_properties — Method
get_workflow_run_properties(name, run_id)
get_workflow_run_properties(name, run_id, params::Dict{String,<:Any})
Retrieves the workflow run properties which were set during the run.
Arguments
name
: Name of the workflow which was run.
run_id
: The ID of the workflow run whose run properties should be returned.
Main.Glue.get_workflow_runs — Method
get_workflow_runs(name)
get_workflow_runs(name, params::Dict{String,<:Any})
Retrieves metadata for all runs of a given workflow.
Arguments
name
: Name of the workflow whose metadata of runs should be returned.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"IncludeGraph"
: Specifies whether to include the workflow graph in the response.
"MaxResults"
: The maximum number of workflow runs to be included in the response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.import_catalog_to_glue — Method
import_catalog_to_glue()
import_catalog_to_glue(params::Dict{String,<:Any})
Imports an existing Amazon Athena Data Catalog to Glue.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the catalog to import. Currently, this should be the Amazon Web Services account ID.
Main.Glue.list_blueprints — Method
list_blueprints()
list_blueprints(params::Dict{String,<:Any})
Lists all the blueprint names in an account.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
"Tags"
: Filters the list by an Amazon Web Services resource tag.
Main.Glue.list_column_statistics_task_runs — Method
list_column_statistics_task_runs()
list_column_statistics_task_runs(params::Dict{String,<:Any})
Lists all task runs for a particular account.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum size of the response.
"NextToken"
: A continuation token, if this is a continuation call.
Main.Glue.list_crawlers — Method
list_crawlers()
list_crawlers(params::Dict{String,<:Any})
Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names. This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
"Tags"
: Specifies to return only these tagged resources.
Main.Glue.list_crawls — Method
list_crawls(crawler_name)
list_crawls(crawler_name, params::Dict{String,<:Any})
Returns all the crawls of a specified crawler. Returns only the crawls that have occurred since the launch date of the crawler history feature, and only retains up to 12 months of crawls. Older crawls will not be returned. You may use this API to:
Retrieve all the crawls of a specified crawler.
Retrieve all the crawls of a specified crawler within a limited count.
Retrieve all the crawls of a specified crawler in a specific time range.
Retrieve all the crawls of a specified crawler with a particular state, crawl ID, or DPU hour value.
Arguments
crawler_name
: The name of the crawler whose runs you want to retrieve.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Filters"
: Filters the crawls by the criteria you specify in a list of CrawlsFilter objects.
"MaxResults"
: The maximum number of results to return. The default is 20, and the maximum is 100.
"NextToken"
: A continuation token, if this is a continuation call.
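A sketch of assembling the "Filters" and "MaxResults" keys for list_crawls. Both helpers are hypothetical (not part of AWS.jl). The CrawlsFilter field names (`FieldName`, `FilterOperator`, `FieldValue`) and the example values `"STATE"`, `"EQ"`, and `"COMPLETED"` are assumptions drawn from the Glue API, not from this page; the lower bound of 1 for MaxResults is also assumed, while the default of 20 and maximum of 100 come from the description above.

```julia
# Hypothetical helper: one CrawlsFilter entry (field names are assumed).
crawl_filter(field, op, value) = Dict{String,Any}(
    "FieldName" => field, "FilterOperator" => op, "FieldValue" => value,
)

# Hypothetical helper: validates MaxResults against the documented maximum
# of 100 before building the params dictionary.
function list_crawls_params(filters; max_results=20)
    1 <= max_results <= 100 ||
        throw(ArgumentError("MaxResults must be between 1 and 100"))
    return Dict{String,Any}("Filters" => filters, "MaxResults" => max_results)
end

params = list_crawls_params([crawl_filter("STATE", "EQ", "COMPLETED")]; max_results=50)

# With AWS.jl loaded and credentials configured:
# Glue.list_crawls("my_crawler", params)
```

Validating the page size locally gives a clearer error than a rejected request would.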
Main.Glue.list_custom_entity_types — Method
list_custom_entity_types()
list_custom_entity_types(params::Dict{String,<:Any})
Lists all the custom patterns that have been created.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A paginated token to offset the results.
"Tags"
: A list of key-value pair tags.
Main.Glue.list_data_quality_results — Method
list_data_quality_results()
list_data_quality_results(params::Dict{String,<:Any})
Returns all data quality execution results for your account.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Filter"
: The filter criteria.
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A paginated token to offset the results.
Main.Glue.list_data_quality_rule_recommendation_runs — Method
list_data_quality_rule_recommendation_runs()
list_data_quality_rule_recommendation_runs(params::Dict{String,<:Any})
Lists the recommendation runs meeting the filter criteria.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Filter"
: The filter criteria.
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A paginated token to offset the results.
Main.Glue.list_data_quality_ruleset_evaluation_runs — Method
list_data_quality_ruleset_evaluation_runs()
list_data_quality_ruleset_evaluation_runs(params::Dict{String,<:Any})
Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Filter"
: The filter criteria.
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A paginated token to offset the results.
Main.Glue.list_data_quality_rulesets — Method
list_data_quality_rulesets()
list_data_quality_rulesets(params::Dict{String,<:Any})
Returns a paginated list of rulesets for the specified list of Glue tables.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Filter"
: The filter criteria.
"MaxResults"
: The maximum number of results to return.
"NextToken"
: A paginated token to offset the results.
"Tags"
: A list of key-value pair tags.
Main.Glue.list_dev_endpoints — Method
list_dev_endpoints()
list_dev_endpoints(params::Dict{String,<:Any})
Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names. This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
"Tags"
: Specifies to return only these tagged resources.
Main.Glue.list_jobs — Method
list_jobs()
list_jobs(params::Dict{String,<:Any})
Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names. This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
"Tags"
: Specifies to return only these tagged resources.
Main.Glue.list_mltransforms — Method
list_mltransforms()
list_mltransforms(params::Dict{String,<:Any})
Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag. This operation takes the optional Tags field, which you can use as a filter of the responses so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tags are retrieved.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Filter"
: A TransformFilterCriteria used to filter the machine learning transforms.
"MaxResults"
: The maximum size of a list to return.
"NextToken"
: A continuation token, if this is a continuation request.
"Sort"
: A TransformSortCriteria used to sort the machine learning transforms.
"Tags"
: Specifies to return only these tagged resources.
Main.Glue.list_registries — Method
list_registries()
list_registries(params::Dict{String,<:Any})
Returns a list of registries that you have created, with minimal registry information. Registries in the Deleting status will not be included in the results. Empty results will be returned if there are no registries available.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": Maximum number of results required per page. If the value is not supplied, this defaults to 25 per page.
"NextToken": A continuation token, if this is a continuation call.
Main.Glue.list_schema_versions — Method
list_schema_versions(schema_id)
list_schema_versions(schema_id, params::Dict{String,<:Any})
Returns a list of schema versions that you have created, with minimal information. Schema versions in Deleted status will not be included in the results. Empty results will be returned if there are no schema versions available.
Arguments
schema_id: A wrapper structure that contains the schema identity fields. Either SchemaArn, or both SchemaName and RegistryName, must be provided. The structure contains:
SchemaArn: The Amazon Resource Name (ARN) of the schema.
SchemaName: The name of the schema.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": Maximum number of results required per page. If the value is not supplied, this defaults to 25 per page.
"NextToken": A continuation token, if this is a continuation call.
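The schema_id argument can be built in two ways, by ARN or by schema name plus registry name. The sketch below shows one way to build either form. The helper name make_schema_id and the values "orders" and "default-registry" are hypothetical placeholders, and the real call is left as a comment because it needs AWS credentials.

```julia
# Build the SchemaId wrapper from either an ARN or a schema/registry name pair.
function make_schema_id(; arn=nothing, schema_name=nothing, registry_name=nothing)
    if arn !== nothing
        return Dict("SchemaArn" => arn)
    elseif schema_name !== nothing && registry_name !== nothing
        return Dict("SchemaName" => schema_name, "RegistryName" => registry_name)
    end
    throw(ArgumentError("provide either arn, or both schema_name and registry_name"))
end

by_name = make_schema_id(schema_name="orders", registry_name="default-registry")
page_params = Dict{String,Any}("MaxResults" => 50)  # overrides the default of 25 per page
# With AWS.jl loaded: Glue.list_schema_versions(by_name, page_params)
```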
Main.Glue.list_schemas — Method
list_schemas()
list_schemas(params::Dict{String,<:Any})
Returns a list of schemas with minimal details. Schemas in Deleting status will not be included in the results. Empty results will be returned if there are no schemas available. When the RegistryId is not provided, all the schemas across registries will be part of the API response.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": Maximum number of results required per page. If the value is not supplied, this defaults to 25 per page.
"NextToken": A continuation token, if this is a continuation call.
"RegistryId": A wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Main.Glue.list_sessions — Method
list_sessions()
list_sessions(params::Dict{String,<:Any})
Retrieves a list of sessions.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": The maximum number of results.
"NextToken": The token for the next set of results, or null if there are no more results.
"RequestOrigin": The origin of the request.
"Tags": Tags belonging to the session.
Main.Glue.list_statements — Method
list_statements(session_id)
list_statements(session_id, params::Dict{String,<:Any})
Lists statements for the session.
Arguments
session_id: The Session ID of the statements.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"NextToken": A continuation token, if this is a continuation call.
"RequestOrigin": The origin of the request to list statements.
Main.Glue.list_table_optimizer_runs — Method
list_table_optimizer_runs(catalog_id, database_name, table_name, type)
list_table_optimizer_runs(catalog_id, database_name, table_name, type, params::Dict{String,<:Any})
Lists the history of previous optimizer runs for a specific table.
Arguments
catalog_id: The Catalog ID of the table.
database_name: The name of the database in the catalog in which the table resides.
table_name: The name of the table.
type: The type of table optimizer. Currently, the only valid value is compaction.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": The maximum number of optimizer runs to return on each call.
"NextToken": A continuation token, if this is a continuation call.
Main.Glue.list_triggers — Method
list_triggers()
list_triggers(params::Dict{String,<:Any})
Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names. This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"DependentJobName": The name of the job for which to retrieve triggers. The trigger that can start this job is returned. If there is no such trigger, all triggers are returned.
"MaxResults": The maximum size of a list to return.
"NextToken": A continuation token, if this is a continuation request.
"Tags": Specifies to return only these tagged resources.
Main.Glue.list_usage_profiles — Method
list_usage_profiles()
list_usage_profiles(params::Dict{String,<:Any})
Lists all the Glue usage profiles.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": The maximum number of usage profiles to return in a single response.
"NextToken": A continuation token, included if this is a continuation call.
Main.Glue.list_workflows — Method
list_workflows()
list_workflows(params::Dict{String,<:Any})
Lists names of workflows created in the account.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": The maximum size of a list to return.
"NextToken": A continuation token, if this is a continuation request.
Main.Glue.put_data_catalog_encryption_settings — Method
put_data_catalog_encryption_settings(data_catalog_encryption_settings)
put_data_catalog_encryption_settings(data_catalog_encryption_settings, params::Dict{String,<:Any})
Sets the security configuration for a specified catalog. After the configuration has been set, the specified encryption is applied to every catalog write thereafter.
Arguments
data_catalog_encryption_settings: The security configuration to set.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId": The ID of the Data Catalog to set the security configuration for. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.put_resource_policy — Method
put_resource_policy(policy_in_json)
put_resource_policy(policy_in_json, params::Dict{String,<:Any})
Sets the Data Catalog resource policy for access control.
Arguments
policy_in_json: Contains the policy document to set, in JSON format.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"EnableHybrid": If 'TRUE', indicates that you are using both methods to grant cross-account access to Data Catalog resources: (1) directly updating the resource policy with PutResourcePolicy, and (2) using the Grant permissions command on the Amazon Web Services Management Console. Must be set to 'TRUE' if you have already used the Management Console to grant cross-account access, otherwise the call fails. Default is 'FALSE'.
"PolicyExistsCondition": A value of MUST_EXIST is used to update a policy. A value of NOT_EXIST is used to create a new policy. If a value of NONE or a null value is used, the call does not depend on the existence of a policy.
"PolicyHashCondition": The hash value returned when the previous policy was set using PutResourcePolicy. Its purpose is to prevent concurrent modifications of a policy. Do not use this parameter if no previous policy has been set.
"ResourceArn": Do not use. For internal use only.
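The two condition keys work together: a first write creates a policy and sends no hash, while an update requires an existing policy and passes the hash from the earlier call. The sketch below shows one way to assemble the optional parameters on that basis. The helper name resource_policy_params and the hash string are hypothetical, and the enum spellings MUST_EXIST and NOT_EXIST follow the Glue API reference.

```julia
# Assemble optional parameters for put_resource_policy.
# Pass `previous_hash` only when a policy has already been set.
function resource_policy_params(; previous_hash=nothing, enable_hybrid=false)
    params = Dict{String,Any}("EnableHybrid" => (enable_hybrid ? "TRUE" : "FALSE"))
    if previous_hash === nothing
        # No earlier policy: ask for creation and send no hash.
        params["PolicyExistsCondition"] = "NOT_EXIST"
    else
        # Update: require an existing policy and guard against concurrent edits.
        params["PolicyExistsCondition"] = "MUST_EXIST"
        params["PolicyHashCondition"] = previous_hash
    end
    return params
end

create_params = resource_policy_params()
update_params = resource_policy_params(previous_hash="hash-from-earlier-call")
# With AWS.jl loaded: Glue.put_resource_policy(policy_json, update_params)
```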
Main.Glue.put_schema_version_metadata — Method
put_schema_version_metadata(metadata_key_value)
put_schema_version_metadata(metadata_key_value, params::Dict{String,<:Any})
Puts the metadata key value pair for a specified schema version ID. A maximum of 10 key value pairs will be allowed per schema version. They can be added over one or more calls.
Arguments
metadata_key_value: The metadata key's corresponding value.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"SchemaId": The unique ID for the schema.
"SchemaVersionId": The unique version ID of the schema version.
"SchemaVersionNumber": The version number of the schema.
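Because each call carries one key value pair and a schema version holds at most 10, it can help to prepare the pairs up front. The sketch below does that locally. It assumes the pair is sent as a dictionary with "MetadataKey" and "MetadataValue" fields and that the schema version has no metadata yet. The helper name metadata_calls and the sample keys are hypothetical.

```julia
# Turn key => value pairs into one argument per put_schema_version_metadata call,
# rejecting lists that would exceed the documented per-version limit.
function metadata_calls(kv_pairs; limit=10)
    length(kv_pairs) <= limit ||
        throw(ArgumentError("at most $limit key value pairs per schema version"))
    return [Dict("MetadataKey" => k, "MetadataValue" => v) for (k, v) in kv_pairs]
end

calls = metadata_calls(["team" => "data-platform", "pii" => "false"])
# With AWS.jl loaded, each element would be passed in turn:
# Glue.put_schema_version_metadata(calls[1], Dict("SchemaVersionId" => version_id))
```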
Main.Glue.put_workflow_run_properties — Method
put_workflow_run_properties(name, run_id, run_properties)
put_workflow_run_properties(name, run_id, run_properties, params::Dict{String,<:Any})
Puts the specified workflow run properties for the given workflow run. If a property already exists for the specified run, its value is overridden; otherwise the property is added to the existing properties.
Arguments
name: Name of the workflow which was run.
run_id: The ID of the workflow run for which the run properties should be updated.
run_properties: The properties to put for the specified run.
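The override-or-add behaviour matches a dictionary merge in which the new values win. The snippet below models that locally with hypothetical property names. It only illustrates the documented semantics and makes no service call.

```julia
# Local model of the documented behaviour: new values win, other keys are kept.
existing = Dict("stage" => "extract", "owner" => "etl-team")
update = Dict("stage" => "load", "batch" => "42")
after_put = merge(existing, update)
```

Here "stage" is overridden, "owner" is kept, and "batch" is added.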
Main.Glue.query_schema_version_metadata — Method
query_schema_version_metadata()
query_schema_version_metadata(params::Dict{String,<:Any})
Queries for the schema version metadata information.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"MaxResults": Maximum number of results required per page. If the value is not supplied, this defaults to 25 per page.
"MetadataList": Search key-value pairs for metadata. If they are not provided, all the metadata information will be fetched.
"NextToken": A continuation token, if this is a continuation call.
"SchemaId": A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
"SchemaVersionId": The unique version ID of the schema version.
"SchemaVersionNumber": The version number of the schema.
Main.Glue.register_schema_version — Method
register_schema_version(schema_definition, schema_id)
register_schema_version(schema_definition, schema_id, params::Dict{String,<:Any})
Adds a new version to the existing schema. Returns an error if the new version of the schema does not meet the compatibility requirements of the schema set. This API will not create a new schema set and will return a 404 error if the schema set is not already present in the Schema Registry. If this is the first schema definition to be registered in the Schema Registry, this API will store the schema version and return immediately. Otherwise, this call has the potential to run longer than other operations due to compatibility modes. You can call the GetSchemaVersion API with the SchemaVersionId to check compatibility modes. If the same schema definition is already stored in Schema Registry as a version, the schema ID of the existing schema is returned to the caller.
Arguments
schema_definition: The schema definition using the DataFormat setting for the SchemaName.
schema_id: A wrapper structure that contains the schema identity fields. Either SchemaArn, or both SchemaName and RegistryName, must be provided. The structure contains:
SchemaArn: The Amazon Resource Name (ARN) of the schema.
SchemaName: The name of the schema.
Main.Glue.remove_schema_version_metadata — Method
remove_schema_version_metadata(metadata_key_value)
remove_schema_version_metadata(metadata_key_value, params::Dict{String,<:Any})
Removes a key value pair from the schema version metadata for the specified schema version ID.
Arguments
metadata_key_value: The value of the metadata key.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"SchemaId": A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
"SchemaVersionId": The unique version ID of the schema version.
"SchemaVersionNumber": The version number of the schema.
Main.Glue.reset_job_bookmark — Method
reset_job_bookmark(job_name)
reset_job_bookmark(job_name, params::Dict{String,<:Any})
Resets a bookmark entry. For more information about enabling and using job bookmarks, see the topics Tracking processed data using job bookmarks, Job parameters used by Glue, and Job structure.
Arguments
job_name: The name of the job in question.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RunId": The unique run identifier associated with this job run.
Main.Glue.resume_workflow_run — Method
resume_workflow_run(name, node_ids, run_id)
resume_workflow_run(name, node_ids, run_id, params::Dict{String,<:Any})
Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run. The selected nodes and all nodes that are downstream from the selected nodes are run.
Arguments
name: The name of the workflow to resume.
node_ids: A list of the node IDs for the nodes you want to restart. The nodes that are to be restarted must have a run attempt in the original run.
run_id: The ID of the workflow run to resume.
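To preview which nodes a resume would touch, the rule "selected nodes plus everything downstream" can be modelled as a graph traversal. The sketch below works on a hypothetical workflow graph held in a local dictionary. It does not query Glue, and the node IDs and the helper name nodes_to_run are invented for illustration.

```julia
# Hypothetical workflow graph: node ID => IDs of the nodes directly downstream.
edges = Dict(
    "crawl" => ["clean"],
    "clean" => ["join", "report"],
    "join" => ["publish"],
    "report" => String[],
    "publish" => String[],
)

# Breadth-first walk from the selected nodes through all downstream nodes.
function nodes_to_run(edges, selected)
    seen = Set{String}()
    queue = collect(selected)
    while !isempty(queue)
        node = popfirst!(queue)
        node in seen && continue
        push!(seen, node)
        append!(queue, get(edges, node, String[]))
    end
    return seen
end

rerun = nodes_to_run(edges, ["clean"])
```

Selecting "clean" leaves "crawl" untouched, because it sits upstream of the selection.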
Main.Glue.run_statement — Method
run_statement(code, session_id)
run_statement(code, session_id, params::Dict{String,<:Any})
Executes the statement.
Arguments
code: The statement code to be run.
session_id: The Session ID of the statement to be run.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RequestOrigin": The origin of the request.
Main.Glue.search_tables — Method
search_tables()
search_tables(params::Dict{String,<:Any})
Searches a set of tables based on properties in the table metadata as well as on the parent database. You can search against text or filter conditions. You can only get tables that you have access to based on the security policies defined in Lake Formation. You need at least a read-only access to the table for it to be returned. If you do not have access to all the columns in the table, these columns will not be searched against when returning the list of tables back to you. If you have access to the columns but not the data in the columns, those columns and the associated metadata for those columns will be included in the search.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId": A unique identifier, consisting of account_id.
"Filters": A list of key-value pairs, and a comparator used to filter the search results. Returns all entities matching the predicate. The Comparator member of the PropertyPredicate struct is used only for time fields, and can be omitted for other field types. Also, when comparing string values, such as when Key=Name, a fuzzy match algorithm is used. The Key field (for example, the value of the Name field) is split into tokens on certain punctuation characters, for example, -, :, #, etc. Then each token is exact-match compared with the Value member of PropertyPredicate. For example, if Key=Name and Value=link, tables named customer-link and xx-link-yy are returned, but xxlinkyy is not returned.
"MaxResults": The maximum number of tables to return in a single response.
"NextToken": A continuation token, included if this is a continuation call.
"ResourceShareType": Allows you to specify that you want to search the tables shared with your account. The allowable values are FOREIGN or ALL. If set to FOREIGN, will search the tables shared with your account. If set to ALL, will search the tables shared with your account, as well as the tables in your local account.
"SearchText": A string used for a text search. Specifying a value in quotes filters based on an exact match to the value.
"SortCriteria": A list of criteria for sorting the results by a field name, in an ascending or descending order.
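The token matching described under "Filters" can be reproduced locally to predict which names a filter would return. The sketch below is an approximation and not the service's implementation. It assumes that every character other than a letter or digit acts as a separator, whereas the documentation only lists "-, :, #, etc." The helper names are hypothetical.

```julia
# Treat any character that is not a letter or digit as a separator (an assumption).
is_separator(c) = !(isletter(c) || isdigit(c))

# Split a table name into tokens, dropping empty pieces.
name_tokens(name) = split(name, is_separator; keepempty=false)

# A name matches when at least one token equals the search value exactly.
token_match(name, value) = any(==(value), name_tokens(name))

table_names = ["customer-link", "xx-link-yy", "xxlinkyy"]
hits = filter(n -> token_match(n, "link"), table_names)

# The matching request parameters would look like this:
search_params = Dict{String,Any}(
    "Filters" => [Dict("Key" => "Name", "Value" => "link")],
    "MaxResults" => 25,
)
```

With the names from the documentation's own example, hits contains customer-link and xx-link-yy but not xxlinkyy.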
Main.Glue.start_blueprint_run — Method
start_blueprint_run(blueprint_name, role_arn)
start_blueprint_run(blueprint_name, role_arn, params::Dict{String,<:Any})
Starts a new run of the specified blueprint.
Arguments
blueprint_name: The name of the blueprint.
role_arn: Specifies the IAM role used to create the workflow.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Parameters": Specifies the parameters as a BlueprintParameters object.
Main.Glue.start_column_statistics_task_run — Method
start_column_statistics_task_run(database_name, role, table_name)
start_column_statistics_task_run(database_name, role, table_name, params::Dict{String,<:Any})
Starts a column statistics task run for a specified table and columns.
Arguments
database_name: The name of the database where the table resides.
role: The IAM role that the service assumes to generate statistics.
table_name: The name of the table for which to generate statistics.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogID": The ID of the Data Catalog where the table resides. If none is supplied, the Amazon Web Services account ID is used by default.
"ColumnNameList": A list of the column names for which to generate statistics. If none is supplied, all column names for the table will be used by default.
"SampleSize": The percentage of rows used to generate statistics. If none is supplied, the entire table will be used to generate stats.
"SecurityConfiguration": Name of the security configuration that is used to encrypt CloudWatch logs for the column stats task run.
Main.Glue.start_crawler — Method
start_crawler(name)
start_crawler(name, params::Dict{String,<:Any})
Starts a crawl using the specified crawler, regardless of what is scheduled. If the crawler is already running, returns a CrawlerRunningException.
Arguments
name: Name of the crawler to start.
Main.Glue.start_crawler_schedule — Method
start_crawler_schedule(crawler_name)
start_crawler_schedule(crawler_name, params::Dict{String,<:Any})
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
Arguments
crawler_name: Name of the crawler to schedule.
Main.Glue.start_data_quality_rule_recommendation_run — Method
start_data_quality_rule_recommendation_run(data_source, role)
start_data_quality_rule_recommendation_run(data_source, role, params::Dict{String,<:Any})
Starts a recommendation run that is used to generate rules when you don't know what rules to write. Glue Data Quality analyzes the data and comes up with recommendations for a potential ruleset. You can then triage the ruleset and modify the generated ruleset to your liking. Recommendation runs are automatically deleted after 90 days.
Arguments
data_source: The data source (Glue table) associated with this run.
role: An IAM role supplied to encrypt the results of the run.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"ClientToken": Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
"CreatedRulesetName": A name for the ruleset.
"NumberOfWorkers": The number of G.1X workers to be used in the run. The default is 5.
"Timeout": The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
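A "ClientToken" only gives idempotency if the same value is reused when a request is retried, so the token is best generated once per logical run. The sketch below builds the optional parameters with a random UUID from Julia's UUIDs standard library and spells out the documented defaults. The commented data_source shape (a "GlueTable" entry with database and table names) is an assumption, and all the names in it are placeholders.

```julia
using UUIDs

# Generate the token once and keep it for any retry of the same logical request.
client_token = string(uuid4())

recommendation_params = Dict{String,Any}(
    "ClientToken" => client_token,
    "NumberOfWorkers" => 5,        # documented default
    "Timeout" => 48 * 60,          # documented default: 48 hours in minutes
)

# Assumed shape of the data_source argument, with placeholder names:
# data_source = Dict("GlueTable" => Dict("DatabaseName" => "sales", "TableName" => "orders"))
# With AWS.jl loaded:
# Glue.start_data_quality_rule_recommendation_run(data_source, role_arn, recommendation_params)
```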
Main.Glue.start_data_quality_ruleset_evaluation_run — Method
start_data_quality_ruleset_evaluation_run(data_source, role, ruleset_names)
start_data_quality_ruleset_evaluation_run(data_source, role, ruleset_names, params::Dict{String,<:Any})
Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table). The evaluation computes results which you can retrieve with the GetDataQualityResult API.
Arguments
data_source: The data source (Glue table) associated with this run.
role: An IAM role supplied to encrypt the results of the run.
ruleset_names: A list of ruleset names.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AdditionalDataSources": A map of reference strings to additional data sources you can specify for an evaluation run.
"AdditionalRunOptions": Additional run options you can specify for an evaluation run.
"ClientToken": Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.
"NumberOfWorkers": The number of G.1X workers to be used in the run. The default is 5.
"Timeout": The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
Main.Glue.start_export_labels_task_run — Method
start_export_labels_task_run(output_s3_path, transform_id)
start_export_labels_task_run(output_s3_path, transform_id, params::Dict{String,<:Any})
Begins an asynchronous task to export all labeled data for a particular transform. This task is the only label-related API call that is not part of the typical active learning workflow. You typically use StartExportLabelsTaskRun when you want to work with all of your existing labels at the same time, such as when you want to remove or change labels that were previously submitted as truth. This API operation accepts the TransformId whose labels you want to export and an Amazon Simple Storage Service (Amazon S3) path to export the labels to. The operation returns a TaskRunId. You can check on the status of your task run by calling the GetMLTaskRun API.
Arguments
output_s3_path: The Amazon S3 path where you export the labels.
transform_id: The unique identifier of the machine learning transform.
Main.Glue.start_import_labels_task_run — Method
start_import_labels_task_run(input_s3_path, transform_id)
start_import_labels_task_run(input_s3_path, transform_id, params::Dict{String,<:Any})
Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality. This API operation is generally used as part of the active learning workflow that starts with the StartMLLabelingSetGenerationTaskRun call and that ultimately results in improving the quality of your machine learning transform. After the StartMLLabelingSetGenerationTaskRun finishes, Glue machine learning will have generated a series of questions for humans to answer. (Answering these questions is often called 'labeling' in the machine learning workflows). In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?” After the labeling process is finished, users upload their answers/labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform use the new and improved labels and perform a higher-quality transformation. By default, StartMLLabelingSetGenerationTaskRun continually learns from and combines all labels that you upload unless you set Replace to true. If you set Replace to true, StartImportLabelsTaskRun deletes and forgets all previously uploaded labels and learns only from the exact set that you upload. Replacing labels can be helpful if you realize that you previously uploaded incorrect labels, and you believe that they are having a negative effect on your transform quality. You can check on the status of your task run by calling the GetMLTaskRun operation.
Arguments
input_s3_path: The Amazon Simple Storage Service (Amazon S3) path from where you import the labels.
transform_id: The unique identifier of the machine learning transform.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"ReplaceAllLabels": Indicates whether to overwrite your existing labels.
Main.Glue.start_job_run — Method
start_job_run(job_name)
start_job_run(job_name, params::Dict{String,<:Any})
Starts a job run using a job definition.
Arguments
job_name: The name of the job definition to use.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AllocatedCapacity": This field is deprecated. Use MaxCapacity instead. The number of Glue data processing units (DPUs) to allocate to this JobRun. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.
"Arguments": The job arguments associated with this run. For this job run, they replace the default arguments set in the job definition itself. You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes. Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job. For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide. For information about the arguments you can provide to this field when configuring Spark jobs, see the Special Parameters Used by Glue topic in the developer guide. For information about the arguments you can provide to this field when configuring Ray jobs, see Using job parameters in Ray jobs in the developer guide.
"ExecutionClass": Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources. The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary. Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.
"JobRunId": The ID of a previous JobRun to retry.
"MaxCapacity": For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page. For Glue version 2.0+ jobs, you cannot specify a Maximum capacity. Instead, you should specify a Worker type and the Number of workers. Do not set MaxCapacity if using WorkerType and NumberOfWorkers. The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL job:
When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
"NotificationProperty": Specifies configuration properties of a job run notification.
"NumberOfWorkers": The number of workers of a defined workerType that are allocated when a job runs.
"SecurityConfiguration": The name of the SecurityConfiguration structure to be used with this job run.
"Timeout": The JobRun timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. This value overrides the timeout value set in the parent job. Streaming jobs must have timeout values less than 7 days or 10080 minutes. When the value is left blank, the job will be restarted after 7 days if you have not set up a maintenance window. If you have set up a maintenance window, it will be restarted during the maintenance window after 7 days.
"WorkerType": The type of predefined worker that is allocated when a job runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.
For the G.1X worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84 GB disk (approximately 34 GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.2X worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128 GB disk (approximately 77 GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.
For the G.4X worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256 GB disk (approximately 235 GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).
For the G.8X worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512 GB disk (approximately 487 GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X worker type.
For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84 GB disk (approximately 34 GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
For the Z.2X worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120 GB free), and provides up to 8 Ray workers based on the autoscaler.
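Two rules in the parameter descriptions lend themselves to a small helper: MaxCapacity must not be combined with WorkerType and NumberOfWorkers, and the DPUs for a run follow from the worker type and worker count. The sketch below encodes both, using only the DPU figures listed above. The helper names and the job name in the final comment are hypothetical, and Z.2X is left out because it is measured in M-DPU.

```julia
# DPUs per worker for the Spark worker types listed in the WorkerType description.
dpu_per_worker = Dict("G.025X" => 0.25, "G.1X" => 1.0, "G.2X" => 2.0,
                      "G.4X" => 4.0, "G.8X" => 8.0)

# Total DPUs for a worker type and worker count.
total_dpus(worker_type, workers) = dpu_per_worker[worker_type] * workers

# Build start_job_run params, refusing the combination the documentation rules out.
function job_run_params(; worker_type=nothing, number_of_workers=nothing,
                        max_capacity=nothing, timeout_minutes=nothing)
    if max_capacity !== nothing && (worker_type !== nothing || number_of_workers !== nothing)
        throw(ArgumentError("set MaxCapacity or WorkerType/NumberOfWorkers, not both"))
    end
    params = Dict{String,Any}()
    worker_type === nothing || (params["WorkerType"] = worker_type)
    number_of_workers === nothing || (params["NumberOfWorkers"] = number_of_workers)
    max_capacity === nothing || (params["MaxCapacity"] = max_capacity)
    timeout_minutes === nothing || (params["Timeout"] = timeout_minutes)
    return params
end

run_params = job_run_params(worker_type="G.2X", number_of_workers=10, timeout_minutes=120)
dpus = total_dpus("G.2X", 10)      # 2 DPU per worker times 10 workers
streaming_limit = 7 * 24 * 60      # the 7-day streaming bound, in minutes
# With AWS.jl loaded: Glue.start_job_run("nightly-etl", run_params)
```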
Main.Glue.start_mlevaluation_task_run — Method
start_mlevaluation_task_run(transform_id)
start_mlevaluation_task_run(transform_id, params::Dict{String,<:Any})
Starts a task to estimate the quality of the transform. When you provide label sets as examples of truth, Glue machine learning uses some of those examples to learn from them. The rest of the labels are used as a test to estimate quality. Returns a unique identifier for the run. You can call GetMLTaskRun to get more information about the stats of the EvaluationTaskRun.
Arguments
transform_id: The unique identifier of the machine learning transform.
Main.Glue.start_mllabeling_set_generation_task_run — Method
start_mllabeling_set_generation_task_run(output_s3_path, transform_id)
start_mllabeling_set_generation_task_run(output_s3_path, transform_id, params::Dict{String,<:Any})
Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels. When the StartMLLabelingSetGenerationTaskRun finishes, Glue will have generated a "labeling set" or a set of questions for humans to answer. In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?” After the labeling process is finished, you can upload your labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform will use the new and improved labels and perform a higher-quality transformation.
Arguments
output_s3_path
: The Amazon Simple Storage Service (Amazon S3) path where you generate the labeling set.
transform_id
: The unique identifier of the machine learning transform.
Main.Glue.start_trigger
— Method
start_trigger(name)
start_trigger(name, params::Dict{String,<:Any})
Starts an existing trigger. See Triggering Jobs for information about how different types of trigger are started.
Arguments
name
: The name of the trigger to start.
Main.Glue.start_workflow_run
— Method
start_workflow_run(name)
start_workflow_run(name, params::Dict{String,<:Any})
Starts a new run of the specified workflow.
Arguments
name
: The name of the workflow to start.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RunProperties"
: The workflow run properties for the new workflow run.
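As a minimal sketch of how the optional parameters are passed, the snippet below builds a params Dict for start_workflow_run. The workflow name and the property keys and values are hypothetical, and RunProperties is assumed here to be a map of strings to strings. The service call itself is left as a comment because it needs AWS.jl and valid credentials.

```julia
# Hypothetical workflow name and run properties (string keys and values).
workflow_name = "nightly-etl"
run_properties = Dict("source_date" => "2024-01-31", "mode" => "incremental")

# Optional parameters go in a Dict keyed by the API field name.
params = Dict{String,Any}("RunProperties" => run_properties)

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured,
# the call would be:
#   Glue.start_workflow_run(workflow_name, params)
```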
Main.Glue.stop_column_statistics_task_run
— Method
stop_column_statistics_task_run(database_name, table_name)
stop_column_statistics_task_run(database_name, table_name, params::Dict{String,<:Any})
Stops a task run for the specified table.
Arguments
database_name
: The name of the database where the table resides.
table_name
: The name of the table.
Main.Glue.stop_crawler
— Method
stop_crawler(name)
stop_crawler(name, params::Dict{String,<:Any})
If the specified crawler is running, stops the crawl.
Arguments
name
: Name of the crawler to stop.
Main.Glue.stop_crawler_schedule
— Method
stop_crawler_schedule(crawler_name)
stop_crawler_schedule(crawler_name, params::Dict{String,<:Any})
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
Arguments
crawler_name
: Name of the crawler whose schedule state to set.
Main.Glue.stop_session
— Method
stop_session(id)
stop_session(id, params::Dict{String,<:Any})
Stops the session.
Arguments
id
: The ID of the session to be stopped.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"RequestOrigin"
: The origin of the request.
Main.Glue.stop_trigger
— Method
stop_trigger(name)
stop_trigger(name, params::Dict{String,<:Any})
Stops a specified trigger.
Arguments
name
: The name of the trigger to stop.
Main.Glue.stop_workflow_run
— Method
stop_workflow_run(name, run_id)
stop_workflow_run(name, run_id, params::Dict{String,<:Any})
Stops the execution of the specified workflow run.
Arguments
name
: The name of the workflow to stop.
run_id
: The ID of the workflow run to stop.
Main.Glue.tag_resource
— Method
tag_resource(resource_arn, tags_to_add)
tag_resource(resource_arn, tags_to_add, params::Dict{String,<:Any})
Adds tags to a resource. A tag is a label you can assign to an Amazon Web Services resource. In Glue, you can tag only certain resources. For information about what resources you can tag, see Amazon Web Services Tags in Glue.
Arguments
resource_arn
: The ARN of the Glue resource to which to add the tags. For more information about Glue resource ARNs, see the Glue ARN string pattern.
tags_to_add
: Tags to add to this resource.
Main.Glue.untag_resource
— Method
untag_resource(resource_arn, tags_to_remove)
untag_resource(resource_arn, tags_to_remove, params::Dict{String,<:Any})
Removes tags from a resource.
Arguments
resource_arn
: The Amazon Resource Name (ARN) of the resource from which to remove the tags.
tags_to_remove
: Tags to remove from this resource.
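The following sketch shows one plausible shape for the tagging arguments. It assumes tags_to_add is a map of tag keys to values and tags_to_remove is a list of tag keys. The ARN, account ID, and tag names are made up for illustration, and the service calls are shown only as comments.

```julia
# Hypothetical job ARN; the account ID and job name are placeholders.
resource_arn = "arn:aws:glue:us-east-1:123456789012:job/nightly-etl"

# Assumed shapes: a key => value map to add, a list of keys to remove.
tags_to_add = Dict("team" => "analytics", "env" => "staging")
tags_to_remove = ["env"]

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.tag_resource(resource_arn, tags_to_add)
#   Glue.untag_resource(resource_arn, tags_to_remove)
```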
Main.Glue.update_blueprint
— Method
update_blueprint(blueprint_location, name)
update_blueprint(blueprint_location, name, params::Dict{String,<:Any})
Updates a registered blueprint.
Arguments
blueprint_location
: Specifies a path in Amazon S3 where the blueprint is published.
name
: The name of the blueprint.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the blueprint.
Main.Glue.update_classifier
— Method
update_classifier()
update_classifier(params::Dict{String,<:Any})
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CsvClassifier"
: A CsvClassifier object with updated fields.
"GrokClassifier"
: A GrokClassifier object with updated fields.
"JsonClassifier"
: A JsonClassifier object with updated fields.
"XMLClassifier"
: An XMLClassifier object with updated fields.
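Because the classifier type is selected by which key is present, a small helper can guard against a misspelled key before the request is sent. This is only a sketch: classifier_params is a hypothetical helper, not part of the module, and the Name and Delimiter fields of the CsvClassifier object are assumptions about its shape.

```julia
# The four keys listed above; anything else is rejected client-side.
const CLASSIFIER_KEYS = ("CsvClassifier", "GrokClassifier", "JsonClassifier", "XMLClassifier")

# Hypothetical helper: wrap the updated fields under exactly one classifier key.
function classifier_params(kind::AbstractString, fields::AbstractDict)
    kind in CLASSIFIER_KEYS || throw(ArgumentError("unknown classifier key: $kind"))
    return Dict{String,Any}(kind => fields)
end

# Assumed field names for a CSV classifier update.
params = classifier_params("CsvClassifier", Dict("Name" => "pipe-csv", "Delimiter" => "|"))

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_classifier(params)
```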
Main.Glue.update_column_statistics_for_partition
— Method
update_column_statistics_for_partition(column_statistics_list, database_name, partition_values, table_name)
update_column_statistics_for_partition(column_statistics_list, database_name, partition_values, table_name, params::Dict{String,<:Any})
Creates or updates partition statistics of columns. The Identity and Access Management (IAM) permission required for this operation is UpdatePartition.
Arguments
column_statistics_list
: A list of the column statistics.
database_name
: The name of the catalog database where the partitions reside.
partition_values
: A list of partition values identifying the partition.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.update_column_statistics_for_table
— Method
update_column_statistics_for_table(column_statistics_list, database_name, table_name)
update_column_statistics_for_table(column_statistics_list, database_name, table_name, params::Dict{String,<:Any})
Creates or updates table statistics of columns. The Identity and Access Management (IAM) permission required for this operation is UpdateTable.
Arguments
column_statistics_list
: A list of the column statistics.
database_name
: The name of the catalog database where the partitions reside.
table_name
: The name of the partitions' table.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.
Main.Glue.update_connection
— Method
update_connection(connection_input, name)
update_connection(connection_input, name, params::Dict{String,<:Any})
Updates a connection definition in the Data Catalog.
Arguments
connection_input
: A ConnectionInput object that redefines the connection in question.
name
: The name of the connection definition to update.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.update_crawler
— Method
update_crawler(name)
update_crawler(name, params::Dict{String,<:Any})
Updates a crawler. If a crawler is running, you must stop it using StopCrawler before updating it.
Arguments
name
: Name of the crawler to update.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Classifiers"
: A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
"Configuration"
: Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.
"CrawlerSecurityConfiguration"
: The name of the SecurityConfiguration structure to be used by this crawler.
"DatabaseName"
: The Glue database where results are stored, such as: arn:aws:daylight:us-east-1::database/sometable/*.
"Description"
: A description of the crawler.
"LakeFormationConfiguration"
: Specifies Lake Formation configuration settings for the crawler.
"LineageConfiguration"
: Specifies data lineage configuration settings for the crawler.
"RecrawlPolicy"
: A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
"Role"
: The IAM role or Amazon Resource Name (ARN) of an IAM role that is used by the crawler to access customer resources.
"Schedule"
: A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
"SchemaChangePolicy"
: The policy for the crawler's update and deletion behavior.
"TablePrefix"
: The table prefix used for catalog tables that are created.
"Targets"
: A list of targets to crawl.
Main.Glue.update_crawler_schedule
— Method
update_crawler_schedule(crawler_name)
update_crawler_schedule(crawler_name, params::Dict{String,<:Any})
Updates the schedule of a crawler using a cron expression.
Arguments
crawler_name
: The name of the crawler whose schedule to update.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Schedule"
: The updated cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
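The cron format in the example above puts the minute first and the hour second. As a sketch, the hypothetical helper daily_cron below builds that daily expression from an hour and minute in UTC; it is not part of the module, and the crawler name in the commented call is a placeholder.

```julia
# Hypothetical helper: build a daily schedule in the form used by the example above.
function daily_cron(hour::Integer, minute::Integer)
    0 <= hour <= 23 || throw(ArgumentError("hour must be in 0:23"))
    0 <= minute <= 59 || throw(ArgumentError("minute must be in 0:59"))
    # Field order is minute, hour, day-of-month, month, day-of-week, year.
    return "cron($minute $hour * * ? *)"
end

# Every day at 12:15 UTC.
params = Dict{String,Any}("Schedule" => daily_cron(12, 15))

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_crawler_schedule("my-crawler", params)
```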
Main.Glue.update_data_quality_ruleset
— Method
update_data_quality_ruleset(name)
update_data_quality_ruleset(name, params::Dict{String,<:Any})
Updates the specified data quality ruleset.
Arguments
name
: The name of the data quality ruleset.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the ruleset.
"Ruleset"
: A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.
Main.Glue.update_database
— Method
update_database(database_input, name)
update_database(database_input, name, params::Dict{String,<:Any})
Updates an existing database definition in a Data Catalog.
Arguments
database_input
: A DatabaseInput object specifying the new definition of the metadata database in the catalog.
name
: The name of the database to update in the catalog. For Hive compatibility, this is folded to lowercase.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog in which the metadata database resides. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.update_dev_endpoint
— Method
update_dev_endpoint(endpoint_name)
update_dev_endpoint(endpoint_name, params::Dict{String,<:Any})
Updates a specified development endpoint.
Arguments
endpoint_name
: The name of the DevEndpoint to be updated.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AddArguments"
: The map of arguments to add to the map of arguments used to configure the DevEndpoint. Valid arguments are: "--enable-glue-datacatalog": "". You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.
"AddPublicKeys"
: The list of public keys for the DevEndpoint to use.
"CustomLibraries"
: Custom Python or Java libraries to be loaded in the DevEndpoint.
"DeleteArguments"
: The list of argument keys to be deleted from the map of arguments used to configure the DevEndpoint.
"DeletePublicKeys"
: The list of public keys to be deleted from the DevEndpoint.
"PublicKey"
: The public key for the DevEndpoint to use.
"UpdateEtlLibraries"
: True if the list of custom libraries to be loaded in the development endpoint needs to be updated, or False otherwise.
Main.Glue.update_job
— Method
update_job(job_name, job_update)
update_job(job_name, job_update, params::Dict{String,<:Any})
Updates an existing job definition. The previous job definition is completely overwritten by this information.
Arguments
job_name
: The name of the job definition to update.
job_update
: Specifies the values with which to update the job definition. Unspecified configuration is removed or reset to default values.
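Since the previous definition is completely overwritten, one way to change a single setting is to start from the current definition and merge the change into it. The sketch below assumes job_update is a Dict of job fields; the role ARN, script location, and field values are invented for illustration. A definition fetched from the service may also carry fields that the update does not accept, which would need to be dropped first.

```julia
# Current job settings, for example copied from an earlier lookup (values are made up).
current = Dict{String,Any}(
    "Role" => "arn:aws:iam::123456789012:role/GlueJobRole",
    "Command" => Dict("Name" => "glueetl", "ScriptLocation" => "s3://example-bucket/scripts/etl.py"),
    "MaxRetries" => 1,
    "Timeout" => 120,
)

# Only the timeout should change.
changes = Dict{String,Any}("Timeout" => 60)

# merge keeps every existing field and lets `changes` win on conflicts,
# so nothing is silently reset by the overwrite.
job_update = merge(current, changes)

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_job("nightly-etl", job_update)
```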
Main.Glue.update_job_from_source_control
— Method
update_job_from_source_control()
update_job_from_source_control(params::Dict{String,<:Any})
Synchronizes a job from the source control repository. This operation takes the job artifacts that are located in the remote repository and updates the Glue internal stores with these artifacts. This API supports optional parameters which take in the repository information.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AuthStrategy"
: The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.
"AuthToken"
: The value of the authorization token.
"BranchName"
: An optional branch in the remote repository.
"CommitId"
: A commit ID for a commit in the remote repository.
"Folder"
: An optional folder in the remote repository.
"JobName"
: The name of the Glue job to be synchronized to or from the remote repository.
"Provider"
: The provider for the remote repository. Possible values: GITHUB, AWSCODECOMMIT, GITLAB, BITBUCKET.
"RepositoryName"
: The name of the remote repository that contains the job artifacts. For BitBucket providers, RepositoryName should include WorkspaceName. Use the format <WorkspaceName>/<RepositoryName>.
"RepositoryOwner"
: The owner of the remote repository that contains the job artifacts.
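The Bitbucket naming rule is easy to miss, so the sketch below wraps it in a hypothetical helper, repository_name, that prefixes the workspace only for that provider. The job, repository, workspace, and branch names are placeholders, and the helper is not part of the module.

```julia
# Hypothetical helper: apply the <WorkspaceName>/<RepositoryName> rule for Bitbucket.
function repository_name(provider::AbstractString, repo::AbstractString; workspace=nothing)
    if provider == "BITBUCKET"
        workspace === nothing && throw(ArgumentError("BITBUCKET needs a workspace name"))
        return "$workspace/$repo"
    end
    return String(repo)
end

params = Dict{String,Any}(
    "JobName" => "nightly-etl",
    "Provider" => "BITBUCKET",
    "RepositoryName" => repository_name("BITBUCKET", "glue-jobs"; workspace="data-team"),
    "BranchName" => "main",
)

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_job_from_source_control(params)
```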
Main.Glue.update_mltransform
— Method
update_mltransform(transform_id)
update_mltransform(transform_id, params::Dict{String,<:Any})
Updates an existing machine learning transform. Call this operation to tune the algorithm parameters to achieve better results. After calling this operation, you can call the StartMLEvaluationTaskRun operation to assess how well your new parameters achieved your goals (such as improving the quality of your machine learning transform, or making it more cost-effective).
Arguments
transform_id
: A unique identifier that was generated when the transform was created.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the transform. The default is an empty string.
"GlueVersion"
: This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.
"MaxCapacity"
: The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page. When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
"MaxRetries"
: The maximum number of times to retry a task for this transform after a task run fails.
"Name"
: The unique name that you gave the transform when you created it.
"NumberOfWorkers"
: The number of workers of a defined workerType that are allocated when this task runs.
"Parameters"
: The configuration parameters that are specific to the transform type (algorithm) used. Conditionally dependent on the transform type.
"Role"
: The name or Amazon Resource Name (ARN) of the IAM role with the required permissions.
"Timeout"
: The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
"WorkerType"
: The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X. For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker. For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker. For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
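The capacity keys interact, so a client-side check can catch an inconsistent combination before the request is made. The sketch below mirrors only the limits stated above (2 to 100 DPUs, the three worker types, and MaxCapacity being read-only for non-Standard workers). The helper transform_capacity_params is hypothetical, and the service remains the authority on what it accepts.

```julia
# Hypothetical helper: build capacity-related params and check them against the stated limits.
function transform_capacity_params(; max_capacity=nothing, worker_type=nothing, number_of_workers=nothing)
    params = Dict{String,Any}()
    if worker_type !== nothing
        worker_type in ("Standard", "G.1X", "G.2X") ||
            throw(ArgumentError("WorkerType must be Standard, G.1X, or G.2X"))
        params["WorkerType"] = worker_type
    end
    if max_capacity !== nothing
        # MaxCapacity is set automatically for non-Standard worker types.
        (worker_type === nothing || worker_type == "Standard") ||
            throw(ArgumentError("MaxCapacity is read-only when WorkerType is $worker_type"))
        2 <= max_capacity <= 100 || throw(ArgumentError("MaxCapacity must be between 2 and 100 DPUs"))
        params["MaxCapacity"] = max_capacity
    end
    number_of_workers !== nothing && (params["NumberOfWorkers"] = number_of_workers)
    return params
end

params = transform_capacity_params(worker_type="G.1X", number_of_workers=5)

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured
# (the transform ID is a placeholder):
#   Glue.update_mltransform("tfm-0123456789abcdef", params)
```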
Main.Glue.update_partition
— Method
update_partition(database_name, partition_input, partition_value_list, table_name)
update_partition(database_name, partition_input, partition_value_list, table_name, params::Dict{String,<:Any})
Updates a partition.
Arguments
database_name
: The name of the catalog database in which the table in question resides.
partition_input
: The new partition object to update the partition to. The Values property can't be changed. If you want to change the partition key values for a partition, delete and recreate the partition.
partition_value_list
: List of partition key values that define the partition to update.
table_name
: The name of the table in which the partition to be updated is located.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the partition to be updated resides. If none is provided, the Amazon Web Services account ID is used by default.
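Because the Values property cannot change, it helps to reuse the same list for both the partition being addressed and the new partition object. This sketch assumes partition_input is a Dict with Values and Parameters fields; the partition values, database and table names, and catalog ID are placeholders.

```julia
# Key values that identify the existing partition.
partition_value_list = ["2024", "01", "31"]

# Assumed shape of the new partition object; Values reuses the same list
# because the key values of a partition cannot be changed by an update.
partition_input = Dict{String,Any}(
    "Values" => partition_value_list,
    "Parameters" => Dict("reprocessed" => "true"),
)

params = Dict{String,Any}("CatalogId" => "123456789012")

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_partition("analytics", partition_input, partition_value_list, "events", params)
```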
Main.Glue.update_registry
— Method
update_registry(description, registry_id)
update_registry(description, registry_id, params::Dict{String,<:Any})
Updates an existing registry which is used to hold a collection of schemas. The updated properties relate to the registry, and do not modify any of the schemas within the registry.
Arguments
description
: A description of the registry. If description is not provided, this field will not be updated.
registry_id
: This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Main.Glue.update_schema
— Method
update_schema(schema_id)
update_schema(schema_id, params::Dict{String,<:Any})
Updates the description, compatibility setting, or version checkpoint for a schema set. For updating the compatibility setting, the call will not validate compatibility for the entire set of schema versions with the new compatibility setting. If the value for Compatibility is provided, the VersionNumber (a checkpoint) is also required. The API will validate the checkpoint version number for consistency. If the value for the VersionNumber (checkpoint) is provided, Compatibility is optional and this can be used to set/reset a checkpoint for the schema. This update will happen only if the schema is in the AVAILABLE state.
Arguments
schema_id
: This is a wrapper structure to contain schema identity fields. The structure contains: SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided. SchemaName: The name of the schema. One of SchemaArn or SchemaName has to be provided.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Compatibility"
: The new compatibility setting for the schema.
"Description"
: The new description for the schema.
"SchemaVersionNumber"
: Version number required for check pointing. One of VersionNumber or Compatibility has to be provided.
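The rule that Compatibility requires a checkpoint version can be enforced before the call. In this sketch, schema_update_params is a hypothetical helper, and the SchemaVersionNumber value is assumed to be a structure with a VersionNumber field. The compatibility value and schema name are examples only.

```julia
# Hypothetical helper: build update_schema params and enforce the checkpoint rule.
function schema_update_params(; compatibility=nothing, version_number=nothing, description=nothing)
    if compatibility !== nothing && version_number === nothing
        throw(ArgumentError("Compatibility requires a checkpoint version number"))
    end
    params = Dict{String,Any}()
    compatibility !== nothing && (params["Compatibility"] = compatibility)
    description !== nothing && (params["Description"] = description)
    if version_number !== nothing
        # Assumed shape of the SchemaVersionNumber structure.
        params["SchemaVersionNumber"] = Dict("VersionNumber" => version_number)
    end
    return params
end

params = schema_update_params(compatibility="BACKWARD", version_number=3)

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_schema(Dict("SchemaName" => "orders"), params)
```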
Main.Glue.update_source_control_from_job
— Method
update_source_control_from_job()
update_source_control_from_job(params::Dict{String,<:Any})
Synchronizes a job to the source control repository. This operation takes the job artifacts from the Glue internal stores and makes a commit to the remote repository that is configured on the job. This API supports optional parameters which take in the repository information.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"AuthStrategy"
: The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.
"AuthToken"
: The value of the authorization token.
"BranchName"
: An optional branch in the remote repository.
"CommitId"
: A commit ID for a commit in the remote repository.
"Folder"
: An optional folder in the remote repository.
"JobName"
: The name of the Glue job to be synchronized to or from the remote repository.
"Provider"
: The provider for the remote repository. Possible values: GITHUB, AWSCODECOMMIT, GITLAB, BITBUCKET.
"RepositoryName"
: The name of the remote repository that contains the job artifacts. For BitBucket providers, RepositoryName should include WorkspaceName. Use the format <WorkspaceName>/<RepositoryName>.
"RepositoryOwner"
: The owner of the remote repository that contains the job artifacts.
Main.Glue.update_table
— Method
update_table(database_name, table_input)
update_table(database_name, table_input, params::Dict{String,<:Any})
Updates a metadata table in the Data Catalog.
Arguments
database_name
: The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.
table_input
: An updated TableInput object to define the metadata table in the catalog.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.
"Force"
: A flag that can be set to true to ignore matching storage descriptor and subobject matching requirements.
"SkipArchive"
: By default, UpdateTable always creates an archived version of the table before updating it. However, if SkipArchive is set to true, UpdateTable does not create the archived version.
"TransactionId"
: The transaction ID at which to update the table contents.
"VersionId"
: The version ID at which to update the table contents.
"ViewUpdateAction"
: The operation to be performed when updating the view.
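As a sketch of a table update that skips the archived version, the snippet below lowercases the database name to match the Hive-compatibility note and sets SkipArchive. The Name and Description fields of the TableInput object are assumed, and all names are placeholders.

```julia
# Database names are lowercase for Hive compatibility, so normalize up front.
database_name = lowercase("Analytics")

# Assumed minimal shape of a TableInput object.
table_input = Dict{String,Any}(
    "Name" => "events",
    "Description" => "Clickstream events, one row per page view",
)

# Skip creating an archived version of the table for this update.
params = Dict{String,Any}("SkipArchive" => true)

# With AWS.jl loaded (`using AWS; @service Glue`) and credentials configured:
#   Glue.update_table(database_name, table_input, params)
```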
Main.Glue.update_table_optimizer
— Method
update_table_optimizer(catalog_id, database_name, table_name, table_optimizer_configuration, type)
update_table_optimizer(catalog_id, database_name, table_name, table_optimizer_configuration, type, params::Dict{String,<:Any})
Updates the configuration for an existing table optimizer.
Arguments
catalog_id
: The Catalog ID of the table.
database_name
: The name of the database in the catalog in which the table resides.
table_name
: The name of the table.
table_optimizer_configuration
: A TableOptimizerConfiguration object representing the configuration of a table optimizer.
type
: The type of table optimizer. Currently, the only valid value is compaction.
Main.Glue.update_trigger
— Method
update_trigger(name, trigger_update)
update_trigger(name, trigger_update, params::Dict{String,<:Any})
Updates a trigger definition.
Arguments
name
: The name of the trigger to update.
trigger_update
: The new values with which to update the trigger.
Main.Glue.update_usage_profile
— Method
update_usage_profile(configuration, name)
update_usage_profile(configuration, name, params::Dict{String,<:Any})
Updates a Glue usage profile.
Arguments
configuration
: A ProfileConfiguration object specifying the job and session values for the profile.
name
: The name of the usage profile.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"Description"
: A description of the usage profile.
Main.Glue.update_user_defined_function
— Method
update_user_defined_function(database_name, function_input, function_name)
update_user_defined_function(database_name, function_input, function_name, params::Dict{String,<:Any})
Updates an existing function definition in the Data Catalog.
Arguments
database_name
: The name of the catalog database where the function to be updated is located.
function_input
: A FunctionInput object that redefines the function in the Data Catalog.
function_name
: The name of the function.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"CatalogId"
: The ID of the Data Catalog where the function to be updated is located. If none is provided, the Amazon Web Services account ID is used by default.
Main.Glue.update_workflow
— Method
update_workflow(name)
update_workflow(name, params::Dict{String,<:Any})
Updates an existing workflow.
Arguments
name
: Name of the workflow to be updated.
Optional Parameters
Optional parameters can be passed as a params::Dict{String,<:Any}. Valid keys are:
"DefaultRunProperties"
: A collection of properties to be used as part of each execution of the workflow.
"Description"
: The description of the workflow.
"MaxConcurrentRuns"
: You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.