Type: Improvement
Resolution: Fixed
Priority: Unknown
Fix Version/s: 1.9.0
Affects Version/s: None
Component/s: Source
Labels: None
Documentation Changes: Needed
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

Schema inference for documents nested in arrays falls back to "string" when any difference is detected between the schemas of the nested documents. This is necessary because Kafka schemas cannot represent arrays whose elements have different types. But we can improve the schema inference to detect some cases where the schemas for the nested documents are actually compatible:

Where the field is present in one document but missing in another
Where the field is present in one document but null in another
Where the field types conflict (in this case the conflict can be pushed down to the schema for that field instead of affecting the whole nested document)
Where the field is an array with elements of some type in one document but an empty array in another
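
The four cases above can be illustrated with a minimal sketch. This is not the connector's implementation: it models schemas as plain Python dicts, and the names `infer` and `merge`, the type labels, and the use of `None` for "type not yet known" are all assumptions made for illustration. Optional/required handling is left out.

```python
# Toy model of schema merging; NOT the connector's actual code.
# Schemas are plain dicts, and None means "type not yet known"
# (a null value, or the element type of an empty array).

def infer(value):
    """Infer a toy schema from a JSON-like Python value."""
    if value is None:
        return None
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "int64"}
    if isinstance(value, float):
        return {"type": "float64"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, dict):
        return {"type": "struct",
                "fields": {k: infer(v) for k, v in value.items()}}
    if isinstance(value, list):
        element = None  # stays None for an empty array
        for item in value:
            element = merge(element, infer(item))
        return {"type": "array", "element": element}
    raise TypeError(f"unsupported value: {value!r}")


def merge(a, b):
    """Combine two toy schemas into one that accepts both."""
    # Missing field, null value, or empty array: keep the known side.
    if a is None:
        return b
    if b is None:
        return a
    # Irreconcilable types: fall back to string at this level only.
    if a["type"] != b["type"]:
        return {"type": "string"}
    if a["type"] == "struct":
        # Union of field names; a field absent on one side merges with None.
        keys = list(a["fields"]) + [k for k in b["fields"] if k not in a["fields"]]
        return {"type": "struct",
                "fields": {k: merge(a["fields"].get(k), b["fields"].get(k))
                           for k in keys}}
    if a["type"] == "array":
        return {"type": "array", "element": merge(a["element"], b["element"])}
    return a


docs = [
    {"a": 1, "b": []},               # "b" is an empty array, "c" is missing
    {"a": None, "b": [1], "c": "x"}, # "a" is null
    {"a": "s"},                      # "a" conflicts with the int seen earlier
]
schema = infer(docs)
```

In this sketch the array of documents keeps a struct element schema: only field `a` falls back to string, while `b` resolves to an array of `int64` and `c` is picked up from the one document that has it. A field that is null or empty in every document would remain `None` here, so a real implementation would still need a final fallback for it.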

is related to

SPARK-375 failed to infer schema of array field if there is data with empty array value (Closed)

related to

KAFKA-349 Schema inference fails with an Array containing nested Structs (Closed)
KAFKA-175 Inferring schema should support variable types for uses with Json with Schema. (Closed)

Assignee: Jeffrey Yemin
Reporter: Jeffrey Yemin
Votes: 0
Watchers: 4

Created: Jan 03 2023 02:44:28 PM UTC
Updated: Oct 28 2023 10:46:19 AM UTC
Resolved: Jan 09 2023 03:21:50 PM UTC
