How to perform an outer product with a custom function (pandas/numpy)?


This content is from Stack Overflow. Question asked by P i.

My dataframe has N rows.
I have M centroids. Each centroid is the same shape as a dataframe-row.

I need to create an N-row by M-column matrix, where the m-th column holds the result of comparing every dataframe row against the m-th centroid.

My solution involves pre-creating the output matrix and filling it one column at a time as we manually iterate over centroids.

It feels clumsy. But I can’t see clearly how to do it ‘properly’.

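    import numpy as np

    def getDistanceMatrix(df, centroids):
        # One row per dataframe row, one column per centroid.
        distanceMatrix = np.zeros((len(df), len(centroids)))

        # Distance = number of positions where centroid and row differ.
        distFunc = lambda centroid, row: sum(centroid != row)

        iCentroid = 0
        for _, centroid in centroids.iterrows():
            # axis=1 applies the function to each row rather than each column.
            distanceMatrix[:, iCentroid] = df.apply(
                lambda row: distFunc(centroid, row), axis=1)
            iCentroid += 1

        return distanceMatrix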

    distanceMatrix = getDistanceMatrix(df, centroids)

It feels like some kind of outer-product-with-a-custom-function.

What’s a good way to write this?
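
One possible reading of the outer-product idea is NumPy broadcasting, where the row axis and the centroid axis are laid out as two separate dimensions and the comparison runs over every pair at once. The sketch below is only an illustration under assumptions, not a confirmed answer: it assumes `df` and `centroids` have the same columns in the same order, and the function name `distance_matrix` and the toy data are made up for this example.

```python
import numpy as np
import pandas as pd


def distance_matrix(df, centroids):
    """Count, for every (row, centroid) pair, the columns whose values differ."""
    # Assumption: both frames have the same columns in the same order.
    rows = df.to_numpy()          # shape (N, D)
    cents = centroids.to_numpy()  # shape (M, D)
    # Broadcasting (N, 1, D) against (1, M, D) gives an (N, M, D) boolean array.
    mismatches = rows[:, None, :] != cents[None, :, :]
    # Summing over the last axis leaves one mismatch count per pair.
    return mismatches.sum(axis=2)


# Hypothetical toy data: N = 3 rows, M = 2 centroids, D = 3 columns.
df = pd.DataFrame({"a": [0, 1, 1], "b": [2, 2, 0], "c": [5, 5, 5]})
centroids = pd.DataFrame({"a": [0, 1], "b": [2, 0], "c": [5, 5]})

result = distance_matrix(df, centroids)
print(result.shape)  # (3, 2)
```

Two differences from the loop version are worth noting: the result is an integer array rather than the float array produced by `np.zeros`, and the intermediate boolean array holds N × M × D entries, so very large inputs may need to be processed in chunks.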



This question and answer are collected from Stack Overflow and tested by the JTuto community, and are licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, CC BY-SA 4.0.
