Issue
This Content is from Stack Overflow. Question asked by daniel guo
I have two PyArrow Tables and want to join them.
A.join(
    right_table=B, keys="A_id", right_keys="B_id"
)
This raises the following error:
{ArrowInvalid} Incompatible data types for corresponding join field keys: FieldRef.Name(A_id) of type int8 and FieldRef.Name(B_id) of type int16
What is the preferred way to solve this issue?
I could not find a way to cast a single column of a PyArrow Table to either int8 or int16.
Thanks
Solution
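The join keys must have the same data type, so you need to change the type of the key field in one of your tables.
Here is how to change the type of the 'A_id' field in table A from int8 to int16: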
import pyarrow as pa

# Change the type of 'A_id' so that it matches 'B_id' (int16)
schema = A.schema
for num, field in enumerate(schema):
    if field.name == 'A_id':
        new_field = field.with_type(pa.int16())  # copy of the field with the new type
        schema = schema.remove(num)  # remove the old field
        schema = schema.insert(num, new_field)  # insert the new field at the same position
A = A.cast(target_schema=schema)  # apply the updated schema to table A

# Join the tables
A.join(
    right_table=B, keys="A_id", right_keys="B_id"
)
This question was asked on Stack Overflow by daniel guo and answered by Lucas M. Uriarte. It is licensed under the terms of CC BY-SA 2.5 - CC BY-SA 3.0 - CC BY-SA 4.0.