
Oracle Fusion (ERP, HCM, SCM) to Databricks: How to Fix the Simba Spark Decimal Precision Issue and Automate Pipelines in Minutes

Many organizations using Oracle Fusion (ERP, HCM, SCM) now recognize the need to move their data into a modern analytics and AI platform like Databricks. Extracting this data gives decision-makers comprehensive, cross-functional insights and makes the data ready for AI.

However, teams automating the data flow from Oracle Fusion to Databricks in a data orchestration platform frequently hit a roadblock: the Simba Spark Decimal Precision error. The error crashes pipelines mid-run and stops data from landing in Databricks through Simba Spark. Teams are then forced to write ad-hoc SQL conversion functions to land the data on each run, which defeats the purpose of pipeline automation, hurts productivity, and drives up project costs.

This blog explains what causes the Simba Spark Decimal Precision error and shows how a simple staging strategy using BI Connector with Microsoft Fabric or Azure Data Factory (ADF) eliminates it for good.

What’s Causing the Simba Spark Decimal Precision Problem?

Neither Microsoft Fabric nor ADF supports writing into Databricks natively. To bridge this gap, pipelines rely on the Simba Spark ODBC driver, which Databricks itself provides. It is the standard, official connection method, but it has a well-known limitation with high-precision numeric data.

Oracle Fusion regularly outputs financial, currency, and tax figures with extended decimal scales. When the Simba Spark driver processes these numbers during the Copy Data activity, it triggers a chain reaction that ends in a pipeline crash.

Here is what actually happens under the hood:

Step 1: The “Wider Box” Problem (Expression Widening)

Think of your data like a piece of furniture you are moving. In Oracle Fusion, it fits perfectly in its box. But as Spark moves the data through the ODBC layer, it tries to be “helpful” by making the container a bit bigger to ensure nothing spills out. It applies internal precision-widening rules that can accidentally stretch a number beyond its original size. A decimal column that was perfectly sized at precision 38 gets internally stretched to precision 47 or beyond.
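To make the widening concrete, here is a minimal Python sketch of the result-type arithmetic that Spark SQL applies to decimal expressions: multiplication yields precision p1 + p2 + 1, and addition gains one extra digit. The DecimalType helper and both operand types are hypothetical, and this article does not identify which expression the driver actually builds, so treat the 38-to-47 case below as one illustrative way to arrive at that number, not a trace of the driver.

```python
from typing import NamedTuple

MAX_PRECISION = 38  # upper bound for DECIMAL precision in Spark / Databricks


class DecimalType(NamedTuple):
    precision: int  # total number of digits
    scale: int      # digits to the right of the decimal point


def multiply_type(a: DecimalType, b: DecimalType) -> DecimalType:
    """Uncapped result type of a * b: precision p1 + p2 + 1, scale s1 + s2."""
    return DecimalType(a.precision + b.precision + 1, a.scale + b.scale)


def add_type(a: DecimalType, b: DecimalType) -> DecimalType:
    """Uncapped result type of a + b: widest integer part + widest scale + 1."""
    scale = max(a.scale, b.scale)
    integer_digits = max(a.precision - a.scale, b.precision - b.scale)
    return DecimalType(integer_digits + scale + 1, scale)


def exceeds_limit(t: DecimalType) -> bool:
    """True when the type no longer fits the 38-digit ceiling."""
    return t.precision > MAX_PRECISION


# Hypothetical operands, chosen only for illustration
amount = DecimalType(38, 18)
factor = DecimalType(8, 0)

widened = multiply_type(amount, factor)
print(widened, exceeds_limit(widened))
```

Spark has its own logic for capping declared result types, which this sketch deliberately leaves out. The point is only that ordinary expression rules can ask for more digits than a column that is already at 38 has room for.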

Step 2: The Safety Check (Schema Reconciliation)

Before the move finishes, Microsoft Fabric performs a safety check. It looks at the entire “container” upfront and compares it against the destination table. Because Spark already stretched the precision in the previous step, a standard number that should be size 38 suddenly looks like a size 47, far larger than what was originally declared.

Step 3: The Crash

Databricks has a strict size limit for decimals. It only accepts a maximum precision of 38. When it sees that stretched size 47 number arriving, it refuses to let it in and your pipeline crashes with this error:

org.apache.spark.sql.AnalysisException: Decimal precision exceeds max precision 38.

This completely stops your automation, forcing your engineers to jump in and manually “shrink” or re-cast the data back down just to get it to land.

String First, Decimal Later: The Two-Stage Fix

The root cause of the crash is that the Simba driver processes and moves numeric data in a single step, leaving room for precision widening to occur. The fix is to break that into two clean stages so Spark never touches the decimal values directly.

This staging strategy is built directly into the BI Connector’s Fusion to Databricks pre-built pipeline template for Fabric and ADF, so you do not have to architect it from scratch. Companies can simply use the BI Connector’s pipeline templates and get their Fusion pipelines up and running in Fabric and ADF in minutes!

Stage 1: Extract Oracle Fusion Data into a Staging Table as String

BI Connector extracts data from Oracle Fusion and lands it in a staging table inside Databricks. The numeric columns in this staging table are intentionally defined as String data types instead of Decimals.

By treating the numbers as plain text during transit, the Simba driver has nothing to widen. The values arrive in Databricks exactly as they left Oracle Fusion: intact, untouched, and with no schema conflicts.
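As a rough illustration of the idea, the Python sketch below derives a staging schema from a production schema by swapping every decimal column for STRING and then renders a CREATE TABLE statement. The table name, column names, and helper functions are all hypothetical; the BI Connector template creates its staging table for you, and its actual DDL is not shown in this article.

```python
import re

# DECIMAL, DEC and NUMERIC are synonyms in Databricks SQL,
# optionally followed by (precision) or (precision, scale).
DECIMAL_PATTERN = re.compile(
    r"^\s*(DECIMAL|DEC|NUMERIC)\s*(\(\s*\d+\s*(,\s*\d+\s*)?\))?\s*$",
    re.IGNORECASE,
)


def staging_schema(production_schema: dict) -> dict:
    """Return a copy of the schema with every decimal column typed as STRING."""
    return {
        column: "STRING" if DECIMAL_PATTERN.match(dtype) else dtype
        for column, dtype in production_schema.items()
    }


def create_table_sql(table: str, schema: dict) -> str:
    """Render a CREATE TABLE statement (identifiers are assumed to be trusted)."""
    columns = ",\n  ".join(f"{column} {dtype}" for column, dtype in schema.items())
    return f"CREATE TABLE IF NOT EXISTS {table} (\n  {columns}\n)"


# Hypothetical production schema for an invoice extract
production = {
    "invoice_id": "BIGINT",
    "amount": "DECIMAL(38,18)",
    "tax_rate": "NUMERIC(10, 6)",
    "currency": "STRING",
}

print(create_table_sql("staging.invoices", staging_schema(production)))
```

Only the decimal columns change type. Everything else keeps its production definition, so the staging table stays a near-mirror of the target.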

Stage 2: Move from Staging to Production via Implicit Casting

Once the data is safely in the staging table, it is moved into your production table, which holds your exact target schema with the correct DECIMAL(precision, scale) definitions. Because this move happens entirely within Databricks, the engine handles the String-to-Decimal conversion itself through implicit casting. No manual code required.
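To picture what the String-to-Decimal conversion involves, here is a small Python emulation using the standard decimal module. It is a sketch, not Databricks code: the function name is hypothetical, the half-up rounding mode is an assumption, and raising an error on overflow is a simplification of what the engine does.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP, localcontext


def cast_to_decimal(text: str, precision: int, scale: int) -> Decimal:
    """Parse text and fit it into DECIMAL(precision, scale), or raise ValueError."""
    with localcontext() as ctx:
        ctx.prec = 76  # working precision well above the 38-digit target
        try:
            value = Decimal(text.strip())
        except InvalidOperation as exc:
            raise ValueError(f"not a number: {text!r}") from exc
        if not value.is_finite():
            raise ValueError(f"not a finite number: {text!r}")

        # Round to the target scale, e.g. scale 2 -> quantum 0.01
        quantum = Decimal((0, (1,), -scale))
        rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)

        # Digits left of the decimal point must fit in precision - scale
        integer_digits = max(rounded.adjusted() + 1, 0)
        if integer_digits > precision - scale:
            raise ValueError(f"{text!r} does not fit DECIMAL({precision},{scale})")
        return rounded


print(cast_to_decimal("1234.5678", 38, 2))
print(cast_to_decimal("1.5", 10, 4))
```

Because the conversion starts from the exact text that left Oracle Fusion, the only precision and scale involved are the ones declared on the production column.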

No Manual CAST Queries Needed

In a typical workaround, engineers have to write explicit conversion queries like SELECT CAST(amount AS DECIMAL(38,18)) for every single numeric column. With this staging approach, Databricks handles the type conversion automatically during the production load, keeping your pipeline code clean and maintenance-free.
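The difference in maintenance burden is easier to see side by side. The Python sketch below renders both styles of load statement from the same schema. The table names, column names, and helper functions are hypothetical, and the statements the BI Connector template actually runs are not shown in this article, so read this as an illustration of the two shapes, not as the template's SQL.

```python
def explicit_cast_insert(target: str, staging: str, schema: dict) -> str:
    """The workaround: one CAST per column, kept in sync with the schema by hand."""
    columns = ", ".join(schema)
    projections = ",\n  ".join(
        f"CAST({column} AS {dtype}) AS {column}" for column, dtype in schema.items()
    )
    return f"INSERT INTO {target} ({columns})\nSELECT\n  {projections}\nFROM {staging}"


def implicit_cast_insert(target: str, staging: str, columns: list) -> str:
    """The staging approach: the target table's column types drive the conversion."""
    column_list = ", ".join(columns)
    return f"INSERT INTO {target} ({column_list})\nSELECT {column_list}\nFROM {staging}"


# Hypothetical production schema and table names
schema = {"invoice_id": "BIGINT", "amount": "DECIMAL(38,18)"}

print(explicit_cast_insert("prod.invoices", "staging.invoices", schema))
print()
print(implicit_cast_insert("prod.invoices", "staging.invoices", list(schema)))
```

The first statement has to change every time a column's precision or scale changes. The second only needs the column names, because the types live in one place: the production table definition.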

Getting Started with the Pre-Built Pipeline Template

You do not need to build any of this logic yourself. BI Connector is a dedicated extraction solution built specifically to pull complex data from Oracle Fusion (ERP, HCM, and SCM) reliably and accurately. It is integrated directly inside the pre-built Fabric/ADF pipeline template, handling the Oracle Fusion extraction layer so your pipeline works end-to-end without any custom development.

All you need to do is import the template into your environment, connect it to your BI Connector instance, and run it. The staging table creation, the string-type definitions, and the production load step are all handled for you out of the box.

Step-by-step guide to set up an automated pipeline from Oracle Fusion to Databricks without hitting Simba Spark Decimal Precision errors

The Benefits of Automating Dataflow from Oracle Fusion to Databricks

Oracle Fusion systems are often the operational backbone of the companies that use them. However, the data locked inside an ERP, HCM, or SCM application has limited analytical value on its own. Moving it to Databricks in an automated way unlocks several business benefits, including:

  • Unified reporting: Blend Oracle data with other sources like your CRM or market data for a complete business picture, then connect directly to Power BI or Tableau for dashboards.
  • Advanced AI and ML: Feed historical Oracle data into predictive models for supply chain forecasting, revenue prediction, or employee retention analysis.
  • Offload reporting workloads: Running heavy analytical queries directly in Oracle puts strain on your live environment. Databricks absorbs that workload without impacting your ERP performance.

Conclusion

The “Decimal precision exceeds max precision 38” error is not a bug you can patch around with clever CAST queries. It is a structural problem with how the Simba driver handles numeric data mid-flight. The only reliable fix is to keep Spark away from your decimal values entirely during transit, which is exactly what the two-stage staging strategy does.

By landing Oracle Fusion data as strings first and letting Databricks handle the type conversion internally, you eliminate the precision widening problem at its source. No manual conversion code, no schema conflicts, no pipeline crashes.

The BI Connector pipeline template has this logic built in. Import it, connect it to your Oracle Fusion instance, and your pipeline runs end-to-end without any custom development needed.

Frequently Asked Questions

Does BI Connector replace the Simba Spark ODBC driver?

No. The Simba driver is still required by ADF and Fabric to connect to Databricks. BI Connector works alongside it, handling the extraction from Oracle Fusion and formatting the data in a way that prevents the driver from triggering precision errors during the copy activity.

What exactly is implicit casting in this context?

When data moves from your staging table into the production table, Databricks automatically converts the String values into the Decimal type defined in the destination schema. It reads the column definition and handles the conversion without you writing a single line of conversion code.

Does this work for Oracle ERP, HCM, and SCM data?

Yes. BI Connector is built to extract data from all Oracle Fusion Cloud modules. The staging strategy applies consistently regardless of which module your data is coming from.

What if a String value in staging cannot be cast to the target Decimal type?

This would only happen if the source data itself is malformed. BI Connector’s extraction ensures that numeric fields from Oracle Fusion are passed through cleanly, so implicit casting in Databricks succeeds reliably under normal conditions.
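If you want to check a sample of staging values yourself before a production load, a small pre-flight scan is enough. The Python sketch below flags values that do not parse as finite decimal numbers. The function name and sample values are hypothetical, and Python's decimal parser is only a stand-in whose accepted grammar may differ slightly from the one Databricks uses.

```python
from decimal import Decimal, InvalidOperation


def find_malformed(values):
    """Return (row_index, text) pairs for values that are not finite decimals."""
    malformed = []
    for index, text in enumerate(values):
        if text is None:
            continue  # a NULL in staging stays NULL, so it is not an error here
        try:
            parsed = Decimal(text.strip())
        except InvalidOperation:
            malformed.append((index, text))
            continue
        if not parsed.is_finite():
            malformed.append((index, text))  # NaN and Infinity are rejected too
    return malformed


# Hypothetical sample of a staging column
sample = ["12.50", "1,234.56", None, "", "-0.001"]
print(find_malformed(sample))
```

In this sample, the value with a thousands separator and the empty string are reported, while the NULL and the well-formed numbers pass through.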


© 2026 Guidanz