[SOLVED] Splice two different dataframes based on similar value – Stack Overflow

Issue

This Content is from Stack Overflow. Question asked by JoeAA

I have two dataframes with different dimensions. Let's say both hold displacement measurements, but the readings differ slightly and one has more data. They look like this:

df1

| Index | displacement |
| --- | --- |
| 1 | 0 |
| 2 | 2 |
| 3 | 4 |
| 4 | 2 |
| 5 | 0 |

df2

| Index | displacement | other data |
| --- | --- | --- |
| 1 | 0 | 5 |
| 2 | 0.4 | 6 |
| 3 | 0.9 | 7 |
| 4 | 1.3 | 8 |
| 5 | 1.8 | 9 |
| 6 | 2.4 | 10 |

I want to add the “other data” column to the first dataframe (df1) by looking up the most similar displacement value in df2 and associating its “other data” with that row. In this case, the output I want should look similar to this:

df1

| Index | displacement | other data (from df2) |
| --- | --- | --- |
| 1 | 0 | 5 |
| 2 | 2 | 9 |

And so on, adding the “other data” from df2 for the remaining rows. I don't know if pd.merge will work. I'm thinking of maybe using a loop that runs until the displacement is higher than the one I'm looking for and then takes the data from the previous row, but df2 has 10 times more rows than df1, and if a displacement measurement is the same as one from a previous row it may not work. Any help with a cleaner/easier way to do it would be greatly appreciated.



Solution

I used the merge_asof function to find the nearest value based on the two DataFrames' displacement columns, and then filtered the resulting DataFrame by a threshold.

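import pandas as pd

# merge_asof requires the key columns to share a dtype
df1['displacement'] = df1['displacement'].astype(float)
# note: this drops repeated displacement values from df1
df1 = df1.drop_duplicates('displacement', keep='last')

# both frames must be sorted by the key; df2's displacement is copied so it survives the merge
df_out = pd.merge_asof(
    df1.sort_values("displacement"),
    df2.sort_values("displacement").assign(df2_displacement=lambda d: d["displacement"]),
    on="displacement",
    direction="nearest",
)

# keep only the matches that are closer than the threshold
threshold = 0.5
dfout1 = df_out[(df_out['displacement'] - df_out['df2_displacement']).abs() < threshold]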
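
To try the approach above end to end, here is a minimal, self-contained sketch that rebuilds the sample tables from the question and applies the same steps. The names `left`, `right`, `nearest`, `filtered` and `with_tolerance` are only illustrative and are not part of the original answer. The last call is a variation rather than the answer's method: it assumes you are happy to let `merge_asof` apply the cutoff itself through its `tolerance` parameter, in which case rows without a close enough match are kept with NaN instead of being dropped (and the bound is inclusive rather than strict).

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({"displacement": [0, 2, 4, 2, 0]})
df2 = pd.DataFrame(
    {
        "displacement": [0, 0.4, 0.9, 1.3, 1.8, 2.4],
        "other data": [5, 6, 7, 8, 9, 10],
    }
)

# merge_asof needs matching key dtypes and keys sorted on both sides.
left = (
    df1.astype({"displacement": float})
    .drop_duplicates("displacement", keep="last")
    .sort_values("displacement")
)
# Copy df2's displacement so it is still available after the merge.
right = df2.sort_values("displacement").assign(
    df2_displacement=lambda d: d["displacement"]
)

# For each row of `left`, pick the row of `right` with the closest displacement.
nearest = pd.merge_asof(left, right, on="displacement", direction="nearest")

# Discard matches that are not close enough (same filter as in the answer).
threshold = 0.5
filtered = nearest[
    (nearest["displacement"] - nearest["df2_displacement"]).abs() < threshold
]

# Variation (assumption, not the answer's code): let merge_asof enforce the cutoff.
with_tolerance = pd.merge_asof(
    left,
    df2.sort_values("displacement"),
    on="displacement",
    direction="nearest",
    tolerance=threshold,
)

print(filtered)
print(with_tolerance)
```

With the sample data, `filtered` keeps the rows for displacement 0 and 2, which is the output the question asks for, while displacement 4 has no df2 reading within the threshold.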



This question was asked on Stack Overflow by JoeAA and answered by Li Yupeng. It is licensed under the terms of CC BY-SA 2.5. - CC BY-SA 3.0. - CC BY-SA 4.0.
