APEX: Integration Playground - Adobe PDF Services

Sydney Nurse

Many of us are leveraging various APIs, REST and AI services to generate text, gain a better understanding of content and have real, data-driven conversations with business data and content.

In this post, hopefully the first of many, I start to experiment with services in the area of understanding and sharing content. For all of the candidates I will be using the free (very important) Developer edition of the service APIs and its limits.

Today’s test set of APIs is Adobe PDF Services, which will help me curate content for an application that I am working on. You can Get Started as an Adobe Developer and start testing today.

About this Post

My current projects and questions are based on REST, JSON or AI-related topics, such as Vector Search, Document Understanding or Generative AI. I do not claim to be an expert in these areas, but they are part of my current research and day-to-day activities as I work with APEX.

Oracle Cloud Infrastructure (OCI) does provide an AI service for Document Understanding, but I am subject to tenant and service limits just like any other average Joe Schmo. Working around these constraints, I figured: why not give someone else a chance to shine?

Use Case

Transcribe, summarise and tag user-uploaded content, so that it may be used to support user enquiries via conversations.

  • For Document Types

    • IF (TYPE ≠ ‘Document’) THEN Convert to PDF

    • All Content should be stored in an OCI Object Storage Bucket

    • The Content’s Text should be Extracted

    • A Summary & Tags of the Content’s Text should be Generated

    • Embeddings for the Content’s Text should be Generated

A similar set of requirements applies to Audio and Video content, but that is something separate and touched on in my post APEX: OCI Speech Integration. The entries highlighted in bold will be covered today.

Assumptions

The documentation is fairly straightforward and gives enough curl examples to make it simple to translate them into APEX calls.

Adobe PDF Services

Adobe PDF Services offer a great set of document services:

  • Creating a PDF from multiple formats, including HTML, Microsoft Office documents, and text files

  • Exporting a PDF to other formats or an image

  • Combining entire PDFs or specified page ranges

  • Using OCR to make a PDF file searchable with a custom locale

  • Compressing PDFs with a chosen compression level and linearizing PDFs

  • Protecting PDFs with password(s) and removing password protection from PDFs

  • Common page operations, including inserting, replacing, deleting, reordering, and rotating

  • Splitting PDFs into multiple files

  • Extract PDF as JSON: the content, structure & renditions of table and figure elements along with Character Bounding Boxes

  • Get the properties of a PDF file like page count, PDF version, file size, compliance levels, font info, permissions and more

  • Improving the accessibility of PDFs (Available under Early Access Program)

You can explore more about these APIs with the sample Postman Collection (Zip download - link is subject to change and if you don’t trust me, get it from the API link provided).

The API round up will include:

  • Getting an Access Token

  • Uploading an Asset

  • Creating a Job & Checking its Status

  • Extract PDF as JSON

  • Downloading an Asset

Getting an Access Token

Adobe has decided that, instead of authenticating with a username (client id) and password (client secret) combination that can be stored as an APEX Web Credential, the service expects these as parameters in the body of the request.

This is not ideal and alarm bells may be ringing here, but it is over an SSL connection.

In the world of APEX we can use REST Data Sources with these passed as bind variable parameters. I’ve done the same using params.

declare
    l_client_id             VARCHAR2(1000)  := 'Your Client ID';
    l_client_secret         VARCHAR2(1000)  := 'Your Client Secret';
    l_response_clob         CLOB;
    l_rest_token            VARCHAR2(1000);
    l_token_url             VARCHAR2(1000)  := 'https://pdf-services-ue1.adobe.io/token';
    l_parm_names            apex_application_global.vc_arr2;
    l_parm_values           apex_application_global.vc_arr2;
begin

    -- Adobe PDF Services
    -- Set up the initial parameters

    l_parm_names(1)     := 'client_id';
    l_parm_values(1)    := l_client_id;
    l_parm_names(2)     := 'client_secret';
    l_parm_values(2)    := l_client_secret;

    -- Get Token
    apex_web_service.g_request_headers.delete();
    apex_web_service.g_request_headers(1).name  := 'Content-Type';
    apex_web_service.g_request_headers(1).value := 'application/x-www-form-urlencoded';

    l_rest_token := JSON_VALUE(apex_web_service.make_rest_request(
    p_url                       => l_token_url,
    p_http_method               => 'POST',
    p_parm_name                 => l_parm_names,
    p_parm_value                => l_parm_values
    ),'$.access_token');

    apex_debug.info('Adobe PDF Services Token -> %s',l_rest_token);
    --DBMS_OUTPUT.PUT_LINE('Adobe PDF Services Token -> ' || l_rest_token); -- Output for SQL clients
end;

The Access Token and Client ID are required for other non-pre-signed requests.
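
Since the token is reused across the later requests, it can help to know how long it stays valid. The following is a minimal sketch, not part of my workflow, that parses a sample token response. The response shape, in particular the expires_in attribute holding the lifetime in seconds, is an assumption that should be checked against the actual response from the token endpoint.

```plsql
declare
    -- Sample token response; the expires_in attribute (in seconds) is an assumption
    l_response    clob := '{"access_token":"sample-token","token_type":"bearer","expires_in":86399}';
    l_token       varchar2(4000);
    l_expires_in  number;
    l_expires_at  timestamp with time zone;
begin
    l_token      := json_value(l_response, '$.access_token');
    l_expires_in := to_number(json_value(l_response, '$.expires_in'));

    -- Stop early when the response holds no token (for example an error payload)
    if l_token is null then
        raise_application_error(-20001, 'No access_token found in the token response');
    end if;

    -- Work out when the token should be requested again
    l_expires_at := systimestamp + numtodsinterval(nvl(l_expires_in, 0), 'SECOND');

    dbms_output.put_line('Token valid until ' ||
        to_char(l_expires_at, 'YYYY-MM-DD HH24:MI:SS TZR'));
end;
```

Storing the expiry time next to the token would allow later steps to request a new token only when the old one is about to expire.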

Uploading an Asset

Before we can upload an asset, we must acquire a pre-signed upload URI for the asset. We are not allowed to simply upload the content as with OCI Object Storage, but if you wish to draw parallels, then this could be considered a pre-authorised request.

The Pre-Signed Upload URI is a unique time bound URI. It can be referenced as many times as required before it expires.

So, we will use a two step process to upload the content as an asset.

Getting a Pre-Signed Upload URI

This request prepares a location to allow assets to be uploaded and will return references to the new Asset ID and the upload location.

💡
The Asset ID can be referenced for subsequent requests, but the service expires the Upload URI after 1 hour.

Referencing the Access Token and Client ID, we can request the pre-signed Upload URI:

declare
    l_client_id       VARCHAR2(1000)  := 'Your Client ID';
    l_rest_token      VARCHAR2(1000)  := 'Your Access Token';
    l_mime_type       VARCHAR2(1000)  := 'Your Assets Content Type';
    l_request_url     VARCHAR2(1000)  := 'https://pdf-services-ue1.adobe.io/assets';
    l_body            JSON_OBJECT_T   := new JSON_OBJECT_T;
    l_response_clob   CLOB;
    l_presigned_url   VARCHAR2(4000);
    l_asset_id        VARCHAR2(4000);
begin


    -- Adobe PDF Services

    -- Get Pre-signed Upload URL    
    l_body.put('mediaType', l_mime_type );

    APEX_DEBUG.INFO('Adobe PDF Services Upload Body: %s', l_body.to_string);

    -- Set additional API parameters
    apex_web_service.g_request_headers.delete();
    apex_web_service.g_request_headers(1).name  := 'x-api-key';
    apex_web_service.g_request_headers(1).value := l_client_id;
    apex_web_service.g_request_headers(2).name  := 'Authorization';
    apex_web_service.g_request_headers(2).value := 'bearer ' || l_rest_token;
    apex_web_service.g_request_headers(3).name  := 'content-type';
    apex_web_service.g_request_headers(3).value := 'application/json';

    l_response_clob := apex_web_service.make_rest_request(
        p_url                       => l_request_url,
        p_http_method               => 'POST',
        p_body                      => l_body.to_string
    );

    APEX_DEBUG.INFO('Adobe PDF Services Asset: %s Upload URL: %s', JSON_VALUE(l_response_clob, '$.assetID') || CHR(10), JSON_VALUE(l_response_clob,'$.uploadUri'));

    l_presigned_url  := JSON_VALUE(l_response_clob,'$.uploadUri');
    l_asset_id       := JSON_VALUE(l_response_clob, '$.assetID');

END;

The Pre-Signed Upload URI and Asset ID are required for the Upload request.

Uploading an Asset with the Pre-Signed Upload URI

Uploading assets no longer requires the Authorization Bearer token or the Client ID as the API key. We simply need to get our Blob and upload it with its correct mime type.

I am storing my object locally in the database for easy retrieval, but this can be any location used for storage.

declare
    l_mime_type      VARCHAR2(1000);
    l_blob_content   BLOB;
    l_response_clob  CLOB;
    l_presigned_url  VARCHAR2(4000); -- uploadUri returned by the Get Pre-signed Upload URI step
begin
    -- Retrieve the Blob Content, sample from a local table but this could be on OCI Object Storage
    select mime_type, blob_contents
     into l_mime_type, l_blob_content 
     from YOUR_TABLE_STORING_YOUR_BLOB_CONTENT
    where id = YOUR_CONTENTS_ID;

    -- Adobe PDF Services
    -- Only the content type is needed; no Authorization or x-api-key header
    apex_web_service.g_request_headers.delete();
    apex_web_service.g_request_headers(1).name  := 'content-type';
    apex_web_service.g_request_headers(1).value := l_mime_type;

    -- PUT the content to the pre-signed Upload URI, not to the /assets endpoint
    l_response_clob := apex_web_service.make_rest_request(
        p_url                       => l_presigned_url,
        p_http_method               => 'PUT',
        p_body_blob                 => l_blob_content
    );

END;

Funnily enough, there is no real documentation on using the pre-signed Upload URL outside of the Getting Started section with its curl example:

curl --location -g --request PUT 'https://dcplatformstorageservice-prod-us-east-1.s3-accelerate.amazonaws.com/b37fd583-1ab6-4f49-99ef-d716180b5de4?X-Amz-Security-Token={{Placeholder for X-Amz-Security-Token}}&X-Amz-Algorithm={{Placeholder for X-Amz-Algorithm}}&X-Amz-Date={{Placeholder for X-Amz-Date}}&X-Amz-SignedHeaders={{Placeholder for X-Amz-SignedHeaders}}&X-Amz-Expires={{Placeholder for X-Amz-Expires}}&X-Amz-Credential={{Placeholder for X-Amz-Credential}}&X-Amz-Signature={{Placeholder for X-Amz-Signature}}' \
--header 'Content-Type: application/pdf' \
--data-binary '@{{Placeholder for file path}}'

Normal checks should be added to the code to ensure an acceptable HTTP 200 code was received.
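
As a rough idea of what such a check could look like, here is a small sketch. The helper name assert_http_success and the error number are my own illustrative choices, not part of APEX or the Adobe API; only apex_web_service.g_status_code comes from APEX. It treats any 2xx status as success.

```plsql
declare
    -- Hypothetical helper: raise an error unless the last request returned a 2xx status
    procedure assert_http_success (p_step in varchar2) is
    begin
        if apex_web_service.g_status_code is null
           or apex_web_service.g_status_code not between 200 and 299
        then
            raise_application_error(
                -20001,
                p_step || ' failed with HTTP status ' ||
                nvl(to_char(apex_web_service.g_status_code), 'unknown'));
        end if;
    end assert_http_success;
begin
    -- Place this call directly after apex_web_service.make_rest_request(...).
    -- Run on its own, without a preceding request, it will raise the error.
    assert_http_success('Asset upload');
end;
```

The same call can follow every request in this post, with a different step name each time, so that a failed upload does not silently turn into a failed job later on.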

💡
I’ve used the Hope ‘N Pray method, as I have Trust issues, that everything was as it should be 🙏🏾

Creating a Job & Checking its Status

My next request will be to convert this asset if it is not already a PDF. I will submit a Create Job request and then loop/poll to check on its status.

As this is APEX and it has Workflow capability, I am actually orchestrating all of this as a workflow and leveraging the Wait activity. Feel free to add your own looping code with sleep methods.

Create PDF Job

The Create PDF API will generate a PDF document from Microsoft Office documents (Word, Excel and PowerPoint) and Image file formats.

It has a few parameters but only the Access Token, Client ID and Asset ID are required.

💡
The job will take some time to retrieve and process the asset. The request will have a unique request ID and URI to check its status.

Let’s submit the request with our current details:

DECLARE
    l_client_id       VARCHAR2(1000)  := 'Your Client ID';
    l_rest_token      VARCHAR2(1000)  := 'Your Access Token';
    l_asset_id        VARCHAR2(4000); -- Asset ID returned by the Get Pre-signed Upload URI step
    l_response_clob   CLOB;
    l_job_url         VARCHAR2(1000)  := 'https://pdf-services-ue1.adobe.io/operation/createpdf';
    l_body            JSON_OBJECT_T   := new JSON_OBJECT_T;
    l_pdf_job_url     VARCHAR2(4000);
    l_pdf_job_id      VARCHAR2(4000);
BEGIN
    BEGIN
        -- Adobe PDF Services
        -- Submit the Create PDF Job
        l_body.put('assetID', l_asset_id);
        --l_body.put('documentLanguage', l_lang_code);
        -- Lang Code needs to be in ISO format if this is important to you

        -- Set additional API parameters
        apex_web_service.g_request_headers.delete();
        apex_web_service.g_request_headers(1).name  := 'x-api-key';
        apex_web_service.g_request_headers(1).value := l_client_id;
        apex_web_service.g_request_headers(2).name  := 'Authorization';
        apex_web_service.g_request_headers(2).value := 'bearer ' || l_rest_token;
        apex_web_service.g_request_headers(3).name  := 'content-type';
        apex_web_service.g_request_headers(3).value := 'application/json';

        l_response_clob := apex_web_service.make_rest_request(
            p_url                       => l_job_url,
            p_http_method               => 'POST',
            p_body                      => l_body.to_string
        );

        -- Check if everything went okay and get the request id and location to check the job's status.
        IF (apex_web_service.g_status_code = 201) THEN
            -- Check Status
            APEX_DEBUG.INFO('Status Code %s job is created', apex_web_service.g_status_code);
            -- Get Headers for Location and Request ID
            FOR i in 1.. apex_web_service.g_headers.count 
            LOOP
                IF apex_web_service.g_headers(i).name = 'location' 
                    THEN l_pdf_job_url := apex_web_service.g_headers(i).value;
                END IF;
                IF apex_web_service.g_headers(i).name = 'x-request-id' 
                    THEN l_pdf_job_id := apex_web_service.g_headers(i).value;
                END IF;
            END LOOP;

        END IF;

    APEX_DEBUG.INFO('PDF Job URL is: %s ', l_pdf_job_url);
    APEX_DEBUG.INFO('PDF Job Request ID is: %s ', l_pdf_job_id);

    EXCEPTION
    WHEN OTHERS THEN
        BEGIN
            APEX_DEBUG.INFO('Exception while submitting Create PDF Job: %s ', apex_web_service.g_status_code);
            APEX_DEBUG.INFO(CHR(10) || SQLCODE);
            APEX_DEBUG.INFO(CHR(10) || SUBSTR(SQLERRM, 1, 64));
            -- Get Headers for Location and Request ID
            FOR i in 1.. apex_web_service.g_headers.count 
            LOOP
                IF apex_web_service.g_headers(i).name = 'location' 
                    THEN l_pdf_job_url := apex_web_service.g_headers(i).value;
                END IF;
                IF apex_web_service.g_headers(i).name = 'x-request-id' 
                    THEN l_pdf_job_id := apex_web_service.g_headers(i).value;
                END IF;
            END LOOP;
            APEX_DEBUG.INFO('PDF Job URL is: %s ', l_pdf_job_url);
            APEX_DEBUG.INFO('PDF Job Request ID is: %s ', l_pdf_job_id);
        END;
    END;
END;
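
The header loop above compares the header names exactly as written. HTTP header names are case-insensitive, so a lookup that ignores case is a little more forgiving. The sketch below assumes a local helper function named get_response_header, which is my own illustrative name and not an APEX API; it only reads the apex_web_service.g_headers collection used above.

```plsql
declare
    -- Hypothetical helper: return the value of a response header, ignoring case
    function get_response_header (p_name in varchar2) return varchar2 is
    begin
        for i in 1 .. apex_web_service.g_headers.count loop
            if lower(apex_web_service.g_headers(i).name) = lower(p_name) then
                return apex_web_service.g_headers(i).value;
            end if;
        end loop;
        return null; -- header not present in the last response
    end get_response_header;
begin
    -- Call after apex_web_service.make_rest_request(...) has populated g_headers
    apex_debug.info('PDF Job URL is: %s ', get_response_header('Location'));
    apex_debug.info('PDF Job Request ID is: %s ', get_response_header('x-request-id'));
end;
```

With a helper like this, both the normal path and the exception handler can fetch the two headers in one line each instead of repeating the loop.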

Checking the Create PDF Job Status

We need to loop/poll the job until it is done. As I mentioned, I am using APEX Workflow for my loop execution, so I will not have any code examples to show that, but a simple sleep will do.

BEGIN
    DBMS_SESSION.SLEEP(60);
END;

The API needs the Access Token, Client ID and Job Request ID.

declare
    l_client_id       VARCHAR2(1000)  := 'Your Client ID';
    l_rest_token      VARCHAR2(1000)  := 'Your Access Token';
    l_pdf_job_url     VARCHAR2(4000); -- Create Job URL from Create PDF Job step
    l_response_clob   CLOB;
    l_pdf_job_status  VARCHAR2(100);
    l_pdf_asset_id    VARCHAR2(4000);
    l_pdf_downld_url  VARCHAR2(4000);
begin
    BEGIN
        -- Adobe PDF Services
        -- Check the Create PDF Job Status
        -- Set additional API parameters
        apex_web_service.g_request_headers.delete();
        apex_web_service.g_request_headers(1).name  := 'x-api-key';
        apex_web_service.g_request_headers(1).value := l_client_id;
        apex_web_service.g_request_headers(2).name  := 'Authorization';
        apex_web_service.g_request_headers(2).value := 'bearer ' || l_rest_token;

        l_response_clob := apex_web_service.make_rest_request(
            p_url                       => l_pdf_job_url,
            p_http_method               => 'GET'
        );

        l_pdf_job_status := JSON_VALUE(l_response_clob,'$.status');
        l_pdf_asset_id   := JSON_VALUE(l_response_clob,'$.asset.assetID');
        l_pdf_downld_url := JSON_VALUE(l_response_clob,'$.asset.downloadUri');

        APEX_DEBUG.INFO('Current Job Status is: %s ', l_pdf_job_status);
        APEX_DEBUG.INFO('Download URL is: %s ', l_pdf_downld_url); 

    EXCEPTION
    WHEN OTHERS THEN
        APEX_DEBUG.INFO('Exception Checking Job Status: %s ', apex_web_service.g_status_code);
    END;

END;

The Job Status has three states

  • inprogress

  • done

  • failed

Evaluate the status to decide how to continue: download the PDF asset, extract the contents as JSON, retry the job, or raise an exception to be handled manually.

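Pseudo Code

case l_pdf_job_status
    when 'done'   then apex_debug.info('Job done, continue with the asset');
    when 'failed' then raise_application_error(-20001, 'Create PDF job failed');
    else apex_debug.info('Job still running, wait and check again'); -- e.g. inprogress
end case;

💡
In the Check Job Status, I set l_pdf_asset_id and l_pdf_downld_url, which are returned when the job is complete.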
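
For those not using APEX Workflow, the status check and the sleep can be combined into a bounded polling loop. This is only a sketch under my own assumptions: the attempt limit, the wait time and the error numbers are arbitrary illustrative values, and anything other than done or failed is treated as still running.

```plsql
declare
    l_client_id       varchar2(1000) := 'Your Client ID';
    l_rest_token      varchar2(4000) := 'Your Access Token';
    l_pdf_job_url     varchar2(4000); -- location header from the Create PDF Job step
    l_response_clob   clob;
    l_pdf_job_status  varchar2(100);
    c_max_attempts    constant pls_integer := 10; -- assumed limit, adjust as needed
    c_wait_seconds    constant pls_integer := 30; -- assumed wait between checks
begin
    for l_attempt in 1 .. c_max_attempts loop
        apex_web_service.g_request_headers.delete();
        apex_web_service.g_request_headers(1).name  := 'x-api-key';
        apex_web_service.g_request_headers(1).value := l_client_id;
        apex_web_service.g_request_headers(2).name  := 'Authorization';
        apex_web_service.g_request_headers(2).value := 'bearer ' || l_rest_token;

        l_response_clob := apex_web_service.make_rest_request(
            p_url         => l_pdf_job_url,
            p_http_method => 'GET'
        );

        l_pdf_job_status := json_value(l_response_clob, '$.status');
        apex_debug.info('Attempt %s, job status: %s', l_attempt, l_pdf_job_status);

        -- Stop polling once the job has reached a final state
        exit when l_pdf_job_status in ('done', 'failed');

        dbms_session.sleep(c_wait_seconds);
    end loop;

    if l_pdf_job_status = 'failed' then
        raise_application_error(-20002, 'Create PDF job failed');
    elsif l_pdf_job_status is null or l_pdf_job_status != 'done' then
        raise_application_error(-20003, 'Create PDF job did not finish in time');
    end if;
end;
```

Capping the number of attempts keeps a stuck job from holding the database session forever, which is the main thing the Workflow Wait activity otherwise takes care of.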

Downloading an Asset

The download uses a similar request to the one for the pre-signed Upload URI. The Check Status call returns the asset’s download URI, which is a pre-signed Download URI.

💡
This step is only required if you need the actual PDF; it is not needed to Extract the PDF as JSON.

DECLARE
    l_pdf_downld_url  VARCHAR2(4000); -- downloadUri from the Check Job Status step
    l_blob_content    BLOB;           -- holds the downloaded PDF
    l_content_length  NUMBER;
    l_content_type    VARCHAR2(4000);

BEGIN
        BEGIN
            -- Get PDF Download URL
            APEX_DEBUG.INFO('PDF Download URL %s' , l_pdf_downld_url);

            -- Set additional API parameters
            apex_web_service.g_request_headers.delete();

            -- Download PDF Adobe PDF Services
            l_blob_content := apex_web_service.make_rest_request_b(
            p_url                       => l_pdf_downld_url,
            p_http_method               => 'GET');

            IF apex_web_service.g_status_code = 200 THEN
                BEGIN
                    APEX_DEBUG.INFO('PDF Downloaded');

                    FOR i IN 1..apex_web_service.g_headers.count
                    LOOP
                        APEX_DEBUG.INFO(apex_web_service.g_headers(i).name || ': ' || apex_web_service.g_headers(i).value ||CHR(10));
                        IF apex_web_service.g_headers(i).name = 'Content-Length' 
                        THEN
                          l_content_length := apex_web_service.g_headers(i).value;
                        END IF;

                        IF apex_web_service.g_headers(i).name = 'Content-Type'
                        THEN
                          l_content_type := apex_web_service.g_headers(i).value;
                        END IF;
                    END LOOP;
                END;
            END IF;

            APEX_DEBUG.INFO('Object Content Length: %s ', l_content_length);
            APEX_DEBUG.INFO('Object Content Type: %s ', l_content_type);

        EXCEPTION
        WHEN OTHERS THEN
            APEX_DEBUG.INFO('Exception Downloading PDF: %s ', apex_web_service.g_status_code);
        END;
END;

Extract PDF as JSON

My next step is actually to extract the content from the document asset. The Extract API extracts content from PDF documents and outputs it in a structured JSON format, along with tables and figures.

The process is similar to Create PDF: a job is submitted to perform the task, and its status then needs to be checked.

Create Extract Job

It has a few parameters but only the Access Token, Client ID and Asset ID are required.

💡
The job will take some time to retrieve and process the asset. The request will have a unique request ID and URI to check its status.

DECLARE
    l_client_id       VARCHAR2(1000)  := 'Your Client ID';
    l_rest_token      VARCHAR2(1000)  := 'Your Access Token';
    l_asset_id        VARCHAR2(4000); -- Asset ID of the PDF to extract
    l_response_clob   CLOB;
    l_job_url         VARCHAR2(1000)  := 'https://pdf-services-ue1.adobe.io/operation/extractpdf';
    l_body            JSON_OBJECT_T   := new JSON_OBJECT_T;
    l_extr_job_url    VARCHAR2(4000);
    l_extr_job_id     VARCHAR2(4000);
BEGIN

    -- Adobe PDF Services

    BEGIN

        -- Submit the Extract PDF Job

        l_body.put('assetID', l_asset_id);
        --l_body.put('documentLanguage', l_lang_code);
        -- Lang Code needs to be in ISO format

        -- Set additional API parameters
        apex_web_service.g_request_headers.delete();
        apex_web_service.g_request_headers(1).name  := 'x-api-key';
        apex_web_service.g_request_headers(1).value := l_client_id;
        apex_web_service.g_request_headers(2).name  := 'Authorization';
        apex_web_service.g_request_headers(2).value := 'bearer ' || l_rest_token;
        apex_web_service.g_request_headers(3).name  := 'content-type';
        apex_web_service.g_request_headers(3).value := 'application/json';

        l_response_clob := apex_web_service.make_rest_request(
            p_url                       => l_job_url,
            p_http_method               => 'POST',
            p_body                      => l_body.to_string
        );

        IF (apex_web_service.g_status_code = 201) THEN
            -- Check Status
            APEX_DEBUG.INFO('Status Code %s job is created', apex_web_service.g_status_code);
            -- Get Headers for Location and Request ID
            FOR i in 1.. apex_web_service.g_headers.count 
            LOOP
                IF apex_web_service.g_headers(i).name = 'location' 
                    THEN l_extr_job_url := apex_web_service.g_headers(i).value;
                END IF;
                IF apex_web_service.g_headers(i).name = 'x-request-id' 
                    THEN l_extr_job_id := apex_web_service.g_headers(i).value;
                END IF;
            END LOOP;

        END IF;

        APEX_DEBUG.INFO('Extract Job URL is: %s ', l_extr_job_url);
        APEX_DEBUG.INFO('Extract Job Request ID is: %s ', l_extr_job_id);

    EXCEPTION
    WHEN OTHERS THEN
        BEGIN
            APEX_DEBUG.INFO('Exception while submitting Extract PDF Job: %s ', apex_web_service.g_status_code);
            APEX_DEBUG.INFO(CHR(10) || SQLCODE);
            APEX_DEBUG.INFO(CHR(10) || SUBSTR(SQLERRM, 1, 64));
            -- Get Headers for Location and Request ID
            FOR i in 1.. apex_web_service.g_headers.count 
            LOOP
                IF apex_web_service.g_headers(i).name = 'location' 
                    THEN l_extr_job_url := apex_web_service.g_headers(i).value;
                END IF;
                IF apex_web_service.g_headers(i).name = 'x-request-id' 
                    THEN l_extr_job_id := apex_web_service.g_headers(i).value;
                END IF;
            END LOOP;
            APEX_DEBUG.INFO('Extract Job URL is: %s ', l_extr_job_url);
            APEX_DEBUG.INFO('Extract Job Request ID is: %s ', l_extr_job_id);
        END;
    END;
END;
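
Beyond the required Asset ID, the request body can carry optional parameters. The sketch below only builds and prints such a body so it can be inspected before it is sent. The elementsToExtract attribute and its text and tables values are my assumption of the Extract API's optional parameters and should be verified against the API reference; the Asset ID is a placeholder.

```plsql
declare
    l_body      json_object_t := new json_object_t;
    l_elements  json_array_t  := new json_array_t;
begin
    -- Required: the asset to extract (placeholder value)
    l_body.put('assetID', 'Your Asset ID');

    -- Optional (assumed attribute name and values): which element types to extract
    l_elements.append('text');
    l_elements.append('tables');
    l_body.put('elementsToExtract', l_elements);

    -- Inspect the body before passing l_body.to_string as p_body
    dbms_output.put_line(l_body.to_string);
end;
```

Building the body with JSON_OBJECT_T and JSON_ARRAY_T, as the rest of the post does, avoids hand-written JSON strings and the quoting mistakes that come with them.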

Checking the Extract PDF Job Status

We need to loop/poll the job until it is done.

The API needs the Access Token, Client ID and Job Request ID.

declare
    l_client_id       VARCHAR2(1000)  := 'Your Client ID';
    l_rest_token      VARCHAR2(1000)  := 'Your Access Token';
    l_extr_job_url    VARCHAR2(4000); -- Job URL from the Create Extract Job step
    l_response_clob   CLOB;
    l_extr_job_status  VARCHAR2(100);
    l_extr_downld_url  VARCHAR2(4000);
begin
    BEGIN
        -- Adobe PDF Services
        -- Check the Extract PDF Job Status
        -- Set additional API parameters
        apex_web_service.g_request_headers.delete();
        apex_web_service.g_request_headers(1).name  := 'x-api-key';
        apex_web_service.g_request_headers(1).value := l_client_id;
        apex_web_service.g_request_headers(2).name  := 'Authorization';
        apex_web_service.g_request_headers(2).value := 'bearer ' || l_rest_token;

        l_response_clob := apex_web_service.make_rest_request(
            p_url                       => l_extr_job_url,
            p_http_method               => 'GET'
        );

        l_extr_job_status := JSON_VALUE(l_response_clob,'$.status');
        l_extr_downld_url := JSON_VALUE(l_response_clob,'$.content.downloadUri');

        APEX_DEBUG.INFO('Current Job Status is: %s ', l_extr_job_status);
        APEX_DEBUG.INFO('Download URL is: %s ', l_extr_downld_url);

    EXCEPTION
    WHEN OTHERS THEN
        APEX_DEBUG.INFO('Exception: %s ', apex_web_service.g_status_code);
    END;

END;

The Job Status has three states

  • inprogress

  • done

  • failed

Evaluate the status to decide how to continue: download the extracted contents as JSON, retry the job, or raise an exception to be handled manually.

💡
In the Check Job Status, I set l_extr_downld_url, which is returned when the job is complete. An Asset ID is returned as well, but I do not need it, so I am not retrieving it.

Download the Extracted JSON

The last step for this article is downloading the Extracted JSON. We’ve seen how to download from the PDF services already and this is exactly the same.

Download an Asset with a Pre-Signed Download URI

DECLARE
    l_extr_downld_url   VARCHAR2(4000);
    bl_text             BLOB;
    l_content_length    number;
    l_content_type      varchar2(4000);

BEGIN

    -- Set additional API parameters
    apex_web_service.g_request_headers.delete();

    -- Download the extracted JSON from Adobe PDF Services
    bl_text := apex_web_service.make_rest_request_b(
    p_url                       => l_extr_downld_url,
    p_http_method               => 'GET');

    IF apex_web_service.g_status_code = 200 THEN
        BEGIN
            APEX_DEBUG.INFO('Text Downloaded');

            FOR i IN 1..apex_web_service.g_headers.count
            LOOP
                APEX_DEBUG.INFO(apex_web_service.g_headers(i).name || ': ' || apex_web_service.g_headers(i).value ||CHR(10));
                IF apex_web_service.g_headers(i).name = 'Content-Length' 
                THEN
                  l_content_length := apex_web_service.g_headers(i).value;
                END IF;

                IF apex_web_service.g_headers(i).name = 'Content-Type'
                THEN
                  l_content_type := apex_web_service.g_headers(i).value;
                END IF;
            END LOOP;

            -- Save the Extract somewhere
        END;
    END IF;

END;

The download is the structuredData.json file, which holds the extracted content and the PDF element structure. See the JSON schema for a description of the default output. Review the How-To Extract PDF for more details.

The Oracle database provides many options to store JSON documents, parse and extract the data attributes.
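
One of those options is JSON_TABLE. The sketch below runs against a small hand-written sample, not real output, and assumes the extract holds an elements array whose entries carry Path, Text and Page attributes; check the JSON schema for the actual structure. The downloaded BLOB would first need to be converted to a CLOB or stored in a JSON column, which is not shown here.

```plsql
declare
    -- Hand-written sample; the elements/Path/Text/Page structure is an assumption
    l_json clob := '{"elements":['
        || '{"Path":"//Document/H1","Text":"Quarterly Review","Page":0},'
        || '{"Path":"//Document/Figure","Page":0},'
        || '{"Path":"//Document/P","Text":"Revenue grew this quarter.","Page":1}'
        || ']}';
begin
    -- Flatten the elements array into rows and keep only entries that carry text
    for r in (
        select jt.page_no, jt.element_path, jt.element_text
          from json_table(l_json, '$.elements[*]'
                 columns (
                     element_path  varchar2(400)  path '$.Path',
                     page_no       number         path '$.Page',
                     element_text  varchar2(4000) path '$.Text'
                 )) jt
         where jt.element_text is not null
    ) loop
        dbms_output.put_line('Page ' || r.page_no || ' ' ||
                             r.element_path || ': ' || r.element_text);
    end loop;
end;
```

Rows like these are a convenient starting point for the summary, tag and embedding steps from the use case, as each row already pairs a piece of text with its page and position in the document structure.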

Conclusion

The Adobe PDF Services offer a complete and easy to use set of APIs to work with PDF documents and have strong AI capabilities to extract, insert and manipulate a document's contents.

For my use case the generated PDF matched the PowerPoint test document and the extract had more than enough details about each page, including styling metadata.

All-in-all I had a positive experience using these APIs, though I am not a big fan of passing credentials in the request body and hope this is not susceptible to attacks.

I hope you find this information useful and interesting.
