Sunday, 26 August 2018

Best practice for threshold argument design in an SVD function

Sorry for the confusing title, but I really can't come up with anything better. I want to compute the SVD of a matrix and extract some of its largest singular values. To do this, I can write something like the following (pseudocode in Python):

def singular_values(matrix, num):
    u, s, v = svd_lib.svd(matrix)
    return s[:num]

Here num is an integer giving the number of singular values I want. Another approach is:

def singular_values(matrix, thresh):
    u, s, v = svd_lib.svd(matrix)
    total = sum(s)  # normalize so that thresh is a fraction of the total
    num = 0
    for s_value in s:
        thresh -= s_value / total
        num += 1
        if thresh <= 0:
            break
    return s[:num]

Here thresh is a float between 0 and 1 giving the proportion of the singular values I want to keep. If I want both behaviours, one plan (Plan A) is to write the two functions separately, as I have done here. But it becomes rather painful when the calling function has to decide which one to use:

def singular_values_caller(matrix, num=None, thresh=None):
    # Exactly one of num and thresh must be given.
    if (num is None) == (thresh is None):
        raise ValueError("specify exactly one of num or thresh")
    if num is not None:
        return singular_values(matrix, num)  # count-based version
    else:
        return singular_values(matrix, thresh)  # threshold-based version

A better way to do this might be to rewrite singular_values (Plan B):

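def singular_values(matrix, thresh):
    u, s, v = svd_lib.svd(matrix)
    if 0 < thresh < 1:
        # thresh is a proportion: accumulate normalized singular values
        total = sum(s)
        num = 0
        for s_value in s:
            thresh -= s_value / total
            num += 1
            if thresh <= 0:
                break
    else:
        # thresh is a count
        num = int(thresh)
    return s[:num]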

This way the calling function is easier to write, but the argument thresh now has two meanings, which I also find uncomfortable.
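To make the two selection rules concrete, here is a minimal runnable sketch using NumPy in place of the pseudocode svd_lib. The function names top_k_singular_values and singular_values_by_fraction are hypothetical, chosen only for illustration, and the sketch assumes that thresh means "fraction of the sum of all singular values" and that the matrix is nonzero.

```python
import numpy as np


def top_k_singular_values(matrix, num):
    """Return the `num` largest singular values (hypothetical helper name)."""
    # compute_uv=False returns only the singular values, in descending order.
    s = np.linalg.svd(matrix, compute_uv=False)
    return s[:num]


def singular_values_by_fraction(matrix, thresh):
    """Return the fewest leading singular values whose sum reaches
    `thresh` (between 0 and 1) of the total (hypothetical helper name)."""
    s = np.linalg.svd(matrix, compute_uv=False)
    # Running share of the total; assumes the matrix is nonzero.
    cumulative = np.cumsum(s) / np.sum(s)
    # First position where the running share reaches thresh,
    # clamped to len(s) to guard against rounding when thresh is 1.
    num = min(int(np.searchsorted(cumulative, thresh)) + 1, len(s))
    return s[:num]


matrix = np.diag([4.0, 3.0, 2.0, 1.0])
print(top_k_singular_values(matrix, 2))
print(singular_values_by_fraction(matrix, 0.65))
```

For this diagonal matrix the running shares are 0.4, 0.7, 0.9 and 1.0, so a thresh of 0.65 keeps the first two singular values, the same result as asking for num=2.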

Which of the two plans is better? If neither is sound, what can I do to make the code easier to read, write, use and modify?

Thank you all for reading this question. This situation should be quite common, because SVD is so widely used and both of my needs here are typical.
