Add Multivariate Distance Covariance metric and the corresponding test of independence#62
Add Multivariate Distance Covariance metric and the corresponding test of independence#62Palash123-4 wants to merge 23 commits intovnmabus:developfrom
Conversation
vnmabus
left a comment
There was a problem hiding this comment.
Thank you for your effort making this PR! I did not check the math yet (I assume you implemented it well), but I have several code suggestions.
dcor/_dcor.py
Outdated
| np.bool = np.bool_ | ||
|
|
||
|
|
||
| def gamma_ratio(p): return np.exp(gammaln((p+1)/2) - gammaln(p/2)) # For Calculating C_p and C_q |
There was a problem hiding this comment.
- The function body needs to be in its own line.
- Functions should be type-annotated.
There was a problem hiding this comment.
I have added the type-annotation
There was a problem hiding this comment.
Maybe I was not very clear: I meant to use Python type hints, not to add the types in the docstring. Types in a docstring cannot be easily verified, while type hints can.
based distance covariance
|
I found some issues regarding the cyclic recalling of rowwise module in _dcor.py. Maybe you can help me to solve it. |
dcor/_dcor.py
Outdated
| return X_new | ||
|
|
||
|
|
||
| def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ='mergesort'): |
There was a problem hiding this comment.
| def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ='mergesort'): | |
| def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ="mergesort"): |
Please, add double quotes everywhere.
dcor/_dcor.py
Outdated
|
|
||
|
|
||
| """ | ||
| A Statistically and Numerically Efficient Independence Test Based on |
There was a problem hiding this comment.
The docstring needs to be inside the corresponding function.
dcor/_dcor.py
Outdated
|
|
||
| References | ||
| ---------- | ||
| .. bibliography:: ../refs.bib |
There was a problem hiding this comment.
Use :footcite: and .. footbibliography instead, as it is easier to maintain.
There was a problem hiding this comment.
I would really appreciate it if you kindly guide me how to do that, I did search this names on the internet, but I am not quite certain how to do it. It would be really nice if you guide me step-by-step process.
There was a problem hiding this comment.
Just use :footcite: instead of :cite: inside the text to place the footnote, and .. footbibliography:: instead of .. biblioggraphy::. You do not need to specify the .bib file, as it is done in the global config.
You can use this docstring as an example:
https://github.com/GAA-UAM/scikit-fda/blob/3eb8ace81220011559df83be0ffb9154a11fb001/skfda/exploratory/stats/_fisher_rao.py#L66-L99
There was a problem hiding this comment.
I implemented the exact format that you have mentioned
dcor/_dcor.py
Outdated
| np.bool = np.bool_ | ||
|
|
||
|
|
||
| def gamma_ratio(p): return np.exp(gammaln((p+1)/2) - gammaln(p/2)) # For Calculating C_p and C_q |
There was a problem hiding this comment.
Maybe I was not very clear: I meant to use Python type hints, not to add the types in the docstring. Types in a docstring cannot be easily verified, while type hints can.
| @njit(fastmath=True, parallel=True, cache=True) | ||
| def dist_sum(X): # It is nothing but the parallelized version of pdist function----------------------- | ||
| ''' | ||
| Parameters |
There was a problem hiding this comment.
You closed the conversation without changing it, please don't do that.
dcor/_dcor.py
Outdated
| ''' | ||
|
|
||
| # X_std = multivariate_normal.rvs( np.zeros(dim_X), np.identity(dim_X), size = 1) | ||
| X_std = np.random.standard_normal(dim_X) |
There was a problem hiding this comment.
I know what random projections are. What I meant, is that it should have a random_state argument, similar to how distance_covariance_test works. This way, it is still possible to generate random data, but you can also pass a fixed seed to obtain reproducible results.
dcor/_dcor.py
Outdated
| ''' | ||
|
|
||
| # X_std = multivariate_normal.rvs( np.zeros(dim_X), np.identity(dim_X), size = 1) | ||
| X_std = np.random.standard_normal(dim_X) |
There was a problem hiding this comment.
Again, do not resolve conversations until the comments have been addressed.
dcor/_dcor.py
Outdated
| X_std = np.random.standard_normal(dim_X) | ||
| X_norm = np.sum([i**2 for i in X_std]) | ||
| U_sphere = np.array(X_std / np.sqrt(X_norm)) | ||
| if dim_X > 1: |
There was a problem hiding this comment.
Maybe, but does the general code not work in that case?
|
Note that the last commit shows a lot of whitespace changes. Please, make sure that your editor is not changing the Unix newlines (LF, or "\n") to Windows ones (CRLF, or "\r\n"). |
I use actually "spyder", not sure why it did happen. But I can use "autopipe" for better indentation if it's necessary. |
Removed White space
I have removed the white space. |
As we discussed, I have added the code corresponding to the random projection based multivariate distance covariance, as discussed in https://www.frontiersin.org/articles/10.3389/fams.2021.779841/full.
I spent some time regarding how to use the gamma function in the best possible way so that the correct p_value can be calculated. (One need to install the "mpmath" python module for gamma_cdf function, for details: https://mpmath.org/doc/current/functions/expintegrals.html)
Please take a look and let me know whether it is working fine or not.