Getting Started with the Gilio API

Gilio is a platform for developers and businesses that need to process large volumes of documents in their operations and digital products. With Gilio, you can ingest documents, extract structured information from them, and transform it, using reference structures and layouts that keep outputs standardized and tailored to your needs.

The Gilio API evaluates and orchestrates large language models (LLMs) using prompting, parsing, and logical-reasoning techniques to achieve high accuracy, even on documents with complex structures or characteristics.


Sign Up and Receive Your API Key

First, fill out and submit the form at gilio.co/signup. You will immediately receive your API Key and Customer ID by email. To get started, you can test the endpoints from the documentation or with your preferred client, such as Insomnia, Postman, or cURL. Send your API Key in the "X-Api-Key" header of every request.

Validate your information with the GET /customers/{customer_id} endpoint.

curl -X GET https://api.gilio.co/v1/customers/{customer_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY"

You will receive something like the response below:

{
	"enabled": true,
	"business_sector": "PROGRAM_DEVELOPMENT",
	"demo_pages_left": 100,
	"company_name": "SENSIA ECUADOR S.A.",
	"created_at": "2024-06-13T17:46:56Z",
	"personal_data_acceptance": true,
	"plan": "enterprise",
	"contact_details": {
		"email": "pparedes@sensia.ec",
		"lastname": "Paredes Meza",
		"name": "Pedro",
		"phone": "0000111122233333",
		"role": "Jefe de Operaciones"
	},
	"customer_id": "f6b78b0c-d404-47ab-906b-6aec72de5008",
	"demo": false,
	"country": "Ecuador"
}
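
The GET request above can also be issued from code. The following is a minimal sketch using only Python's standard library; the helper name build_request is hypothetical, while the base URL and header names are taken from the cURL example. It only builds the request object, so nothing is sent until you uncomment the last lines.

```python
import json
import urllib.request

BASE_URL = "https://api.gilio.co/v1"  # taken from the cURL examples


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request to the Gilio API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url=BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
    )


customer_id = "YOUR_CUSTOMER_ID"
request = build_request("GET", f"/customers/{customer_id}", "YOUR_API_KEY")

# To send it, uncomment the lines below:
# with urllib.request.urlopen(request) as response:
#     customer = json.load(response)
#     print(customer["company_name"], customer["plan"])
```

The same helper accepts a dictionary as body, which is serialized to JSON for PUT and POST requests.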

You can also edit the allowed fields using the PUT /customers/{customer_id} endpoint. Follow the example below.

curl -X PUT https://api.gilio.co/v1/customers/{customer_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY" \
     -d '{
            "contact_details": {
                "lastname": "Paredes Meza"
            }
         }'


Preparing the Training

A Tag is a reference model for both the JSON structure and the digitized HTML/PDF output extracted from a document. It ensures a standardized process and visual appearance for your other documents in the same category. Gilio is highly customizable; you can configure:

  • Instructions: Specify details or specific requirements in your structures, such as identifying ambiguous fields, specific transformations, changes in the type of data captured, among others.

  • Required Pages: If your document has multiple pages but you only need to extract information from some of them, you can configure which pages should be considered for processing. If your document has a single page or an image, add "1" (first page) as the only value.

  • Digitization: If you have handwritten documents, besides extracting the information, you can create a digital version of it in PDF, maintaining the same visual appearance as the original document or creating an entirely new style. To do this, set the "output_resources" field to true.


To create a Tag, use the POST /tag/create endpoint. You can follow the example below:

curl -X POST https://api.gilio.co/v1/tag/create \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY" \
     -d '{
            "customer_id": "YOUR_CUSTOMER_ID",
            "tag_name": "ORDEN_INSTALACION",
            "description": "Orden de Instalacion de Equipos",
            "output_resources": false,
            "required_pages": [1],
            "json_instructions": "Genera respuestas concisas, unicamente la estructura JSON valida y lista para usar, sin caracteres adicionales",
            "html_instructions": "Crea una plantilla facil de transformar a PDF y que sea visualmente atractiva. Genera de forma concisa, unicamente la estructura HTML lista para usar, sin caracteres adicionales"
         }'
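
As a minimal sketch, the request body for creating a Tag could be assembled and sanity-checked in Python before sending it. The function name build_tag_payload and its checks are illustrative assumptions, not part of the API; only the field names come from the example above.

```python
import json


def build_tag_payload(customer_id, tag_name, description, required_pages,
                      json_instructions="", html_instructions="",
                      output_resources=False):
    """Assemble the body for POST /tag/create (field names from the example)."""
    # Assumption: pages are 1-based, as in the "Required Pages" description.
    if not required_pages or any(
        not isinstance(page, int) or page < 1 for page in required_pages
    ):
        raise ValueError("required_pages must list 1-based page numbers")
    return {
        "customer_id": customer_id,
        "tag_name": tag_name,
        "description": description,
        "output_resources": bool(output_resources),
        "required_pages": list(required_pages),
        "json_instructions": json_instructions,
        "html_instructions": html_instructions,
    }


payload = build_tag_payload(
    customer_id="YOUR_CUSTOMER_ID",
    tag_name="ORDEN_INSTALACION",
    description="Orden de Instalacion de Equipos",
    required_pages=[1],
)
body = json.dumps(payload)  # pass this string as the request body
```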


The response contains the Tag ID in the "tag_id" field. Use it in the following requests to process files.

{
	"customer_id": "f6b78b0c-d404-47ab-906b-6aec72de5008",
	"tag_name": "ORDEN_INSTALACION",
	...
	"tag_id": "baf01f92-8fef-49b7-8cb3-29f46e1bb993",
	"created_at": "2024-06-15 22:55:48.116388"
}


Upload Your First File for Training 🏋🏻‍♂️

The first processing run is always a training. This creates a reliable reference to validate and generate standard outputs for future documents in the same category. Host the file on your preferred platform or cloud service (Google Drive, OneDrive, Odoo, AWS S3, SAP, among others) and send the URL of the file you want to process. Start the training by sending the value "training" in the "process_type" field.

To upload a file, use the POST /files/{tag_id} endpoint as shown in the example below:

curl -X POST https://api.gilio.co/v1/files/{tag_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY" \
     -d '{
            "file_url": "https://drive.google.com/uc?export=download&id=1VUOFV0jxnbVMAcIIFRuHZx0tupcTzsZx",
            "process_type": "training"
         }'


🛎️ Important

Use URLs that point to the file download. For example, Google Drive URLs by default point to a web view that cannot be processed. You will need to use your file ID (capture it from the URL) and use it in a download URL, as shown in the example.

You can also send the file content as Base64 through the "file_content" field.
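
The two options above can be sketched in Python. The helper names are hypothetical. The code assumes that Google Drive sharing links carry the file ID either in the path (/file/d/<ID>/view) or in an "id" query parameter, and that "file_content" expects plain Base64 without any prefix; verify both against your own links and the API reference.

```python
import base64
import re
from urllib.parse import parse_qs, urlparse


def drive_download_url(share_url):
    """Turn a Google Drive sharing link into a direct-download URL."""
    parsed = urlparse(share_url)
    # Sharing links usually look like /file/d/<FILE_ID>/view
    match = re.search(r"/file/d/([^/]+)", parsed.path)
    if match:
        file_id = match.group(1)
    else:
        # Other links carry the ID in the query string (?id=<FILE_ID>)
        ids = parse_qs(parsed.query).get("id")
        if not ids:
            raise ValueError("no file ID found in URL")
        file_id = ids[0]
    return f"https://drive.google.com/uc?export=download&id={file_id}"


def encode_file_content(data):
    """Base64-encode raw file bytes for the "file_content" field."""
    return base64.b64encode(data).decode("ascii")


url = drive_download_url(
    "https://drive.google.com/file/d/1VUOFV0jxnbVMAcIIFRuHZx0tupcTzsZx/view?usp=sharing"
)
print(url)
```

To send a local file instead of a URL, read it in binary mode and pass the bytes to encode_file_content.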


Types of Processing

There are 4 types of processing available:

  • training: Simultaneous training of the JSON structure and HTML/PDF appearance.

  • json_training: Training of the JSON structure only.

  • html_training: Training of the PDF/HTML document appearance (required if you have the "output_resources" option enabled).

  • vision: Processing of a document. Remember, you must always have trained a Tag beforehand for accurate validations.
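
A small Python sketch can guard against typos in the processing type before a file is uploaded. The function name build_file_payload is hypothetical, and the rule that exactly one of "file_url" or "file_content" is sent per request is an assumption, not something the API is confirmed to enforce.

```python
PROCESS_TYPES = {"training", "json_training", "html_training", "vision"}


def build_file_payload(process_type, file_url=None, file_content=None):
    """Assemble the body for POST /files/{tag_id}."""
    if process_type not in PROCESS_TYPES:
        raise ValueError(f"unknown process_type: {process_type!r}")
    # Assumption: exactly one of the two sources is sent per request.
    if (file_url is None) == (file_content is None):
        raise ValueError("provide exactly one of file_url or file_content")
    payload = {"process_type": process_type}
    if file_url is not None:
        payload["file_url"] = file_url
    else:
        payload["file_content"] = file_content
    return payload


payload = build_file_payload(
    "training",
    file_url="https://drive.google.com/uc?export=download&id=1VUOFV0jxnbVMAcIIFRuHZx0tupcTzsZx",
)
```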


Explore and validate the results 🔍

The upload request returns a response like the following:

{
	"request_id": "507959f6-f4a1-4424-ac84-065c8e946980",
	"file_path": "source_files/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/vision_507959f6-f4a1-4424-ac84-065c8e946980.pdf"
}


Use the "request_id" field to check the status of your document processing. Make a request to the GET /files/request/{request_id} endpoint, and you will receive a response like the example below:

{
	"size": "42886",
	"process_id": "6e8f2f0b-87bc-5bbc-723d-6b63bb982b6c",
	"created_at": "2024-06-16 13:48:33",
	"analysis_keys": [
		"results/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/6e8f2f0b-87bc-5bbc-723d-6b63bb982b6c/page-0.jpeg"
	],
	"status": "SUCCESS_PDF_GENERATED",
	"customer_id": "f6b78b0c-d404-47ab-906b-6aec72de5008",
	"source_path": "source_files/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/vision_507959f6-f4a1-4424-ac84-065c8e946980.pdf",
	"pages_count": "1",
	"results_path": "results/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/6e8f2f0b-87bc-5bbc-723d-6b63bb982b6c",
	"file_path": "source_files/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/vision_507959f6-f4a1-4424-ac84-065c8e946980.pdf",
	"request_id": "507959f6-f4a1-4424-ac84-065c8e946980",
	"process_type": "vision",
	"number_of_pages": "1",
	"source_type": "source_files",
	"upload_type": "url",
    "results": [
		"payload.json",
		"template.html"
	],
	"tag_id": "baf01f92-8fef-49b7-8cb3-29f46e1bb993"
}

Review the outputs generated by the training. If you are satisfied with the results, build the "json_path" and "html_path" values for your Tag by appending the file names listed in the "results" field to the path in the "results_path" field. Each path should start with the prefix "results_training/".

Make a request to the PUT /tags/{tag_id} endpoint to update the information related to your Tag:

curl -X PUT https://api.gilio.co/v1/tags/{tag_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY" \
     -d '{
            "webhook": "https://YOUR_WEBHOOK_URL",
            "json_instructions": "Genera respuestas concisas, unicamente la estructura JSON valida y lista para usar, sin caracteres adicionales",
            "html_instructions": "Crea vista HTML facil de imprimir como PDF y que sea visualmente atractiva. Genera de forma concisa, unicamente la estructura HTML lista para usar, sin caracteres adicionales",
            "output_resources": true,
            "json_path": "results_training/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/44bf2d72-ae10-9687-0237-088d612853c7/payload.json",
            "html_path": "results_training/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/44bf2d72-ae10-9687-0237-088d612853c7/template.html"
         }'
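
The paths in the request above combine a training results path with a file name. The sketch below shows one way to build them in Python. The function name tag_paths is hypothetical, and matching files by their .json and .html extensions is an assumption based on the file names in the examples.

```python
import posixpath


def tag_paths(results_path, results):
    """Build json_path and html_path from a training response."""
    if not results_path.startswith("results_training/"):
        raise ValueError("expected a path under results_training/")
    paths = {}
    for name in results:
        # Assumption: the .json file is the JSON reference, the .html file the template.
        if name.endswith(".json"):
            paths["json_path"] = posixpath.join(results_path, name)
        elif name.endswith(".html"):
            paths["html_path"] = posixpath.join(results_path, name)
    return paths


paths = tag_paths(
    "results_training/f6b78b0c-d404-47ab-906b-6aec72de5008/"
    "baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/"
    "44bf2d72-ae10-9687-0237-088d612853c7",
    ["payload.json", "template.html"],
)
print(paths["json_path"])
print(paths["html_path"])
```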


🛎️ Important

  • Do not forget to update your Tag with the generated paths in json_path and html_path. This ensures Gilio recognizes the created references for processing your documents. Failing to do so may result in errors or unexpected outputs.

  • You can edit your instructions as many times as needed until you achieve the desired outputs.


Process Your First Real File 📃

With your Tag created and trained, start processing all other files using the value "vision" as the "process_type". Remember, you can check the status of your document processing with the "request_id" field. The available statuses are:

  • SUCCESS_ANALYZE_DOCUMENT: Initial validations.

  • SUCCESS_JSON_GENERATED: JSON structure successfully generated.

  • SUCCESS_HTML_GENERATED: HTML structure successfully generated.

  • SUCCESS_PDF_GENERATED: PDF document successfully generated (if "output_resources" is enabled).

  • SUCCESS_IMAGE_GENERATED: Image successfully generated (if "output_resources" is enabled).

Here is an example to start processing a file using the POST /files/{tag_id} endpoint:

curl -X POST https://api.gilio.co/v1/files/{tag_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY" \
     -d '{
            "file_url": "https://drive.google.com/uc?export=download&id=1VUOFV0jxnbVMAcIIFRuHZx0tupcTzsZx",
            "process_type": "vision"
         }'


You should receive a response like the example below. Keep the "request_id" value; you need it to check the processing status in the next request.

{
	"request_id": "c1a25bd0-ee54-4a1f-a6f8-60c65269ad85",
	"source_path": "vision/c7a9894a-6cb5-4075-9f2f-ccaa66201b65/1b5f34dd-4aa3-4818-90de-8c2c75b973a0/2024/06/vision_c1a25bd0-ee54-4a1f-a6f8-60c65269ad85.pdf"
}


Make a request to the GET /files/request/{request_id} endpoint using the "request_id" you just received. You can poll this endpoint until the response includes the "json_payload" field and the processing reaches its final status.

curl -X GET https://api.gilio.co/v1/files/request/{request_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY"
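
Polling can be sketched as a small loop with a timeout. In the Python example below, fetch_status stands for any function that performs the GET request above and returns the decoded JSON; that name, the interval, and the timeout are assumptions. Which status counts as final depends on your Tag configuration, so it is passed in explicitly. The demo uses canned responses instead of real API calls.

```python
import time


def wait_for_result(fetch_status, request_id, final_statuses,
                    interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the response has a json_payload and a final status."""
    deadline = time.monotonic() + timeout
    while True:
        info = fetch_status(request_id)
        if "json_payload" in info and info.get("status") in final_statuses:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request {request_id} did not finish in time")
        sleep(interval)


# Stand-in for a real GET /files/request/{request_id} call (hypothetical data).
_fake_responses = iter([
    {"status": "SUCCESS_ANALYZE_DOCUMENT"},
    {"status": "SUCCESS_JSON_GENERATED", "json_payload": "{}"},
    {"status": "SUCCESS_PDF_GENERATED", "json_payload": "{}"},
])

result = wait_for_result(
    fetch_status=lambda request_id: next(_fake_responses),
    request_id="YOUR_REQUEST_ID",
    final_statuses={"SUCCESS_PDF_GENERATED"},
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(result["status"])  # SUCCESS_PDF_GENERATED
```

If you configure a Webhook for the Tag, you can avoid polling altogether.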


This is the response for a fully processed document:

	{
		"json_payload": "{\n    \"title\": \"PARTE DE OPERACION DEL EQUIPO\",\n    \"page_1\": {\n        \"key_values\": {\n            \"fecha\": \"2024-02-05\",\n            \"equipo\": \"EXCAVADORA\",\n            \"marca\": \"CAT 320\",\n            \"obra\": \"PREFECTURA DEL GUAYAS\",\n            \"operador\": \"LEONARDO ANTHONIO\",\n            \"codigo\": \"MB-362\",\n            \"horometro_hora_inicial\": \"5449\",\n            \"horometro_hora_final\": \"5459\",\n            \"diferencia_horometro\": \"10\",\n            \"km_inicial\": \"\",\n            \"km_final\": \"\",\n            \"jornada_hora_inicial\": \"07:00\",\n            \"jornada_hora_final\": \"18:00\",\n            \"suman_horas_trabajadas\": \"10\",\n            \"codigo_paralizacion\": \"15\",\n            \"paralizacion_inicio\": \"12:00\",\n            \"paralizacion_fin\": \"13:00\",\n            \"suman_horas_paralizadas\": \"1\",\n            \"observaciones\": \"LIMPIANDO PALISADO Y AMPLIACION DE ESTERO QUINTERO MESADO ABAJO\",\n            \"numero_de_referencia\": \"2511642\"\n        },\n        \"table_0\": {\n            \"horas_trabajadas\": {\n                \"horometro_hora_inicial\": \"5449\",\n                \"horometro_hora_final\": \"5459\",\n                \"diferencia_horometro\": \"10\",\n                \"jornada_hora_inicial\": \"07:00\",\n                \"jornada_hora_final\": \"18:00\",\n                \"jornadas\": [\n                    {\n                        \"desde\": \"07:00\",\n                        \"hasta\": \"12:00\",\n                        \"total\": 5\n                    },\n                    {\n                        \"desde\": \"13:00\",\n                        \"hasta\": \"18:00\",\n                        \"total\": 5\n                    }\n                ],\n                \"suma_horas_trabajadas\": \"10\"\n            }\n        },\n        \"table_1\": {\n            \"paralizaciones\": [\n                {\n                    \"codigo\": \"15\",\n                    \"inicio\": \"12:00\",\n                    \"fin\": \"13:00\",\n                    \"total\": 1\n                },\n                {\n                    \"codigo\": \"03\",\n                    \"motivo\": \"CLIMA (LLUVIA - NEBLINA)\"\n                },\n                {\n                    \"codigo\": \"05\",\n                    \"motivo\": \"SIN OPERADOR O CHOFER\"\n                },\n                {\n                    \"codigo\": \"07\",\n                    \"motivo\": \"FALTA DE AREA\"\n                },\n                {\n                    \"codigo\": \"08\",\n                    \"motivo\": \"PARADA POR DAÑO O REPARACION\"\n                },\n                {\n                    \"codigo\": \"09\",\n                    \"motivo\": \"MANTENIMIENTO\"\n                },\n                {\n                    \"codigo\": \"12\",\n                    \"motivo\": \"FALTA DE COMBUSTIBLE\"\n                },\n                {\n                    \"codigo\": \"16\",\n                    \"motivo\": \"TRASLADO DE EQUIPO\"\n                },\n                {\n                    \"codigo\": \"17\",\n                    \"motivo\": \"ABASTECIMIENTO\"\n                },\n                {\n                    \"codigo\": \"18\",\n                    \"motivo\": \"MANTEN. O REPARACION (EQ. ENCENDIDO)\"\n                }\n            ],\n            \"suman_horas_paralizadas\": \"1\"\n        },\n        \"warnings\": []\n    }\n}",
		"size": "181062",
		"process_id": "4882b20a-9dfc-fd4c-82cc-68fa09b3eaba",
		"created_at": "2024-06-25 14:40:40",
		"analysis_keys": [
			"source_files/c7a9894a-6cb5-4075-9f2f-ccaa66201b65/fee75b61-41f7-468d-9e30-3d0cd64256c5/2024/06/vision_d194b181-70c5-41bb-a9fc-c80f905bacf7.jpeg"
		],
		"status": "SUCCESS_PDF_GENERATED",
		"customer_id": "c7a9894a-6cb5-4075-9f2f-ccaa66201b65",
		"source_path": "source_files/c7a9894a-6cb5-4075-9f2f-ccaa66201b65/fee75b61-41f7-468d-9e30-3d0cd64256c5/2024/06/vision_d194b181-70c5-41bb-a9fc-c80f905bacf7.jpeg",
		"pages_count": "1",
		"results_path": "results/c7a9894a-6cb5-4075-9f2f-ccaa66201b65/fee75b61-41f7-468d-9e30-3d0cd64256c5/2024/06/4882b20a-9dfc-fd4c-82cc-68fa09b3eaba",
		"request_id": "d194b181-70c5-41bb-a9fc-c80f905bacf7",
		"process_type": "vision",
		"number_of_pages": "0",
		"results": [
			"payload.json",
			"template-0.html",
			"template-0.pdf"
		],
		"upload_type": "url",
		"source_type": "source_files",
		"tag_id": "fee75b61-41f7-468d-9e30-3d0cd64256c5"
	}


Enjoy your results! The extracted data is in the "json_payload" field. Remember that you can train the Tag again and add more instructions to improve the process and its results.
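
Note that "json_payload" is a JSON document serialized as a string, so it needs a second decoding step. The Python sketch below illustrates this with a shortened stand-in for the response above. The function name extract_fields is hypothetical, and the "page_N" and "key_values" layout is only what this example shows; the actual structure depends on how your Tag was trained.

```python
import json


def extract_fields(status_response):
    """Decode json_payload and return (title, key/values per page)."""
    # json_payload is a JSON document serialized as a string, so decode it.
    document = json.loads(status_response["json_payload"])
    pages = {}
    for key, value in document.items():
        # Assumption: pages are keyed "page_1", "page_2", ... as in the example.
        if key.startswith("page_") and isinstance(value, dict):
            pages[key] = value.get("key_values", {})
    return document.get("title"), pages


# Shortened stand-in for the response shown above.
sample = {
    "status": "SUCCESS_PDF_GENERATED",
    "json_payload": json.dumps({
        "title": "PARTE DE OPERACION DEL EQUIPO",
        "page_1": {"key_values": {"equipo": "EXCAVADORA"}, "warnings": []},
    }),
}
title, pages = extract_fields(sample)
print(title)                      # PARTE DE OPERACION DEL EQUIPO
print(pages["page_1"]["equipo"])  # EXCAVADORA
```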

Configure Webhooks 🌐

Set up a Webhook for each Tag to receive notifications about status changes in document processing. This allows you to, for instance, trigger tasks in automation tools when a document changes status or completes its processing. You can also use the ready JSON structure to compose a dynamic email or a PDF to send as an attachment.

curl -X PUT https://api.gilio.co/v1/tags/{tag_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY" \
     -d '{
            "webhook": "https://YOUR_WEBHOOK_URL"
         }'


🛎️ Important

  • Webhook URLs must be publicly reachable and accept POST requests with JSON bodies.
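
A receiver for such notifications can be sketched with Python's standard library. The notification schema is not documented here, so the "request_id" and "status" fields read below are assumptions; inspect a real notification before relying on them. The names parse_notification, WebhookHandler, and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(raw_body):
    """Decode a webhook body; return None when it is not a JSON object."""
    try:
        data = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    return data if isinstance(data, dict) else None


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        notification = parse_notification(self.rfile.read(length))
        if notification is None:
            self.send_response(400)
        else:
            # Assumed field names; adjust them to the real notification body.
            print(notification.get("request_id"), notification.get("status"))
            self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted (blocking call)."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Call serve() to start listening. For Gilio to reach the receiver, it must be exposed on a public URL.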


Manage Your Tags and Files 🗃️

You can create as many Tags as you need based on the types of documents you want to process. Retrieve a list of them and their configurations using the GET /tags/customer/{customer_id} endpoint:

curl -X GET https://api.gilio.co/v1/tags/customer/{customer_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY"

The response should be something like this:

[
	{
		"updated_at": "2024-06-16T13:48:28.964025",
		"created_at": "2024-06-15 22:55:48.116388",
		"html_instructions": "Crea vista HTML facil de imprimir como PDF y que sea visualmente atractiva. Genera de forma concisa, unicamente la estructura HTML lista para usar, sin caracteres adicionales",
		"json_path": "results_training/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/44bf2d72-ae10-9687-0237-088d612853c7/payload.json",
		"required_pages": [
			1
		],
		"html_path": "results_training/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/44bf2d72-ae10-9687-0237-088d612853c7/template.html",
		"json_instructions": "Genera respuestas concisas, unicamente la estructura JSON valida y lista para usar, sin caracteres adicionales",
		"customer_id": "f6b78b0c-d404-47ab-906b-6aec72de5008",
		"output_resources": true,
		"tag_name": "ORDEN_INSTALACION",
		"description": "Orden de Instalacion de Equipos",
		"tag_id": "baf01f92-8fef-49b7-8cb3-29f46e1bb993"
	},
    {
      ...
    }
]


Retrieve a list of processed files, their statuses, results, and available metadata by making a request to the GET /files/{tag_id} endpoint:

curl -X GET https://api.gilio.co/v1/files/{tag_id} \
     -H "Content-Type: application/json" \
     -H "X-Api-Key: YOUR_API_KEY"

Your response should look like this:

[
	{
		"size": 42886,
		"process_id": "6e8f2f0b-87bc-5bbc-723d-6b63bb982b6c",
		"created_at": "2024-06-16 13:48:33",
		"analysis_keys": [
			"results/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/6e8f2f0b-87bc-5bbc-723d-6b63bb982b6c/page-0.jpeg"
		],
		"status": "SUCCESS_ANALYZING_DOCUMENT",
		"customer_id": "f6b78b0c-d404-47ab-906b-6aec72de5008",
		"source_path": "source_files/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/vision_507959f6-f4a1-4424-ac84-065c8e946980.pdf",
		"pages_count": 1,
		"results_path": "results/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/6e8f2f0b-87bc-5bbc-723d-6b63bb982b6c",
		"file_path": "source_files/f6b78b0c-d404-47ab-906b-6aec72de5008/baf01f92-8fef-49b7-8cb3-29f46e1bb993/2024/06/vision_507959f6-f4a1-4424-ac84-065c8e946980.pdf",
		"request_id": "507959f6-f4a1-4424-ac84-065c8e946980",
		"process_type": "vision",
		"number_of_pages": 1,
		"upload_type": "url",
		"source_type": "source_files",
		"tag_id": "baf01f92-8fef-49b7-8cb3-29f46e1bb993"
	},
	{
		"size": 42886,
		"process_id": "aca52db7-024a-97af-1d16-21f9953e1ffc",
		"created_at": "2024-06-16 13:29:50",
      ...
    }
  ]
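
Once the list is decoded, it can be summarized in a few lines of Python. This sketch groups request IDs by their current status; the function name group_by_status is hypothetical, and the sample entries are shortened, made-up stand-ins for the response above.

```python
from collections import defaultdict


def group_by_status(files):
    """Map each status to the request IDs currently in that status."""
    groups = defaultdict(list)
    for entry in files:
        groups[entry.get("status", "UNKNOWN")].append(entry.get("request_id"))
    return dict(groups)


# Shortened, hypothetical entries shaped like the list above.
files = [
    {"request_id": "507959f6-f4a1-4424-ac84-065c8e946980",
     "status": "SUCCESS_ANALYZING_DOCUMENT"},
    {"request_id": "EXAMPLE_REQUEST_ID", "status": "SUCCESS_PDF_GENERATED"},
]
for status, request_ids in group_by_status(files).items():
    print(status, len(request_ids))
```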