Seeing all walks of life through MITBBS (买买提)

All topics - topic: correlation
e*n
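Posts: 1511
1
[The following is reposted from the Stock board]
From: mitbbs2020 (bbc), Board: Stock
Subject: [Strategy Forum] Trading with correlations.
Posted at: BBS 未名空间站 (Sun Jul 25 16:27:03 2010, US Eastern)
Someone on trader1688 also brought up pair trading, and some people expressed doubts about it.
The idea was proposed by people at Morgan Stanley in the 1980s. People in the finance industry are fairly liquid,
so after a few job changes the idea gradually became popular.
The basic idea: for two highly correlated stocks, the spread is fairly stable. If at some moment the spread
deviates noticeably from its average, you can basically assume that one stock is overpriced and the other underpriced,
so you long the underpriced symbol and short the overpriced one, expecting the spread to revert to normal.
Overpriced and underpriced here are relative. If both symbols rise, the expectation is that the long
leg rises more than the short leg, so the spread can still ...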
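The entry rule described in this post could be sketched roughly as follows. This is only an illustration, not the poster's actual script: the function names, the fixed `hedge_ratio`, and the z-score threshold of 2 are all assumptions.

```python
from statistics import mean, stdev


def spread_zscore(prices_a, prices_b, hedge_ratio=1.0):
    """Z-score of the latest spread relative to the earlier spread history.

    hedge_ratio is a hypothetical fixed ratio; in practice it would be estimated.
    """
    spread = [a - hedge_ratio * b for a, b in zip(prices_a, prices_b)]
    history = spread[:-1]                      # everything before the latest point
    return (spread[-1] - mean(history)) / stdev(history)


def signal(z, entry=2.0):
    """Map a spread z-score to a trade suggestion (threshold is an assumption)."""
    if z > entry:        # spread unusually wide: A looks rich relative to B
        return "short A, long B"
    if z < -entry:       # spread unusually narrow: A looks cheap relative to B
        return "long A, short B"
    return "no trade"
```

The position is meant to be closed once the z-score drifts back toward zero; whether the spread actually reverts is exactly what the later posts in this thread argue about.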
R******t
Posts: 2648
2
From topic: Quant board - empirical correlation
I thought you were asking how an empirical correlation matrix can be obtained.
An empirical correlation is a correlation estimated from actual data; its counterpart is the true
correlation matrix.
For example, for a portfolio of N assets with weight w_i on the i-th asset, the
variance of the portfolio is
\sum_{i,j} w_i \sigma_i C_{ij} \sigma_j w_j,
The C used here is the true correlation matrix, which is unknown in practice. But we can use historical
data to obtain a correlation matrix, and that is called the empirical correlation matrix.
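As a minimal sketch of the formula in this post, the snippet below builds an empirical correlation matrix from historical return series and plugs it into \sum_{i,j} w_i \sigma_i C_{ij} \sigma_j w_j. The function names and the toy inputs are made up for illustration.

```python
from math import sqrt
from statistics import mean


def empirical_corr(x, y):
    """Sample Pearson correlation of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corr_matrix(returns):
    """Empirical correlation matrix; `returns` holds one series per asset."""
    n = len(returns)
    return [[empirical_corr(returns[i], returns[j]) for j in range(n)]
            for i in range(n)]


def portfolio_variance(w, sigma, C):
    """sum_{i,j} w_i * sigma_i * C_ij * sigma_j * w_j."""
    n = len(w)
    return sum(w[i] * sigma[i] * C[i][j] * sigma[j] * w[j]
               for i in range(n) for j in range(n))
```

With the true C unknown, the result is only as good as the historical window used to estimate it.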
w**********y
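Posts: 1691
3
If we really have to get into this filtration business and manufacture a 'correlation process', things get even
messier. We are all code monkeys here, not pure theorists; people hear a few terms and start making things up. As long as you don't say anything wrong,
that's good enough..
In statistics, correlation is defined for two random variables, not for stochastic processes.
You can define the correlation of A_t and B_t, and of course you can also define the correlation of A_n and B_m; it has
nothing whatsoever to do with any so-called filtration.
The mathematical meaning of filtration is: a collection of nested sigma-algebras.
By convention, we assume that what you want is the correlation of A_t and B_t. Then all the definitions
are based on a probability space:
This probability space is (\Omega, F_t, P). Roughly speaking, \Omega is a set in the space of functions on
[0,t] \times R (just think of it as the set of all possible paths traced from 0 to t), and F_t is
a sigma-algebra defined on \Omega, similar to the Borel sets (empty set, intersection, union, complement). Your A_t and B_t ...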
u******3
Posts: 11
4
From topic: Statistics board - correlation among independent variables
The second paragraph is from a research paper. Again, a question about the
correlation among independent variables:
Are collinearity diagnostics part of the correlation matrix? Should we say that the
correlation matrix alleviates (reduces) correlation, or that it simply detects
correlation among independent variables and that we then use collinearity
diagnostics to see whether multicollinearity is within an acceptable level? (my
questions)
Table 5 reports the correlation matrix for the independent variables. The
following discussion
d******o
Posts: 59
5
From topic: Statistics board - correlation among independent variables
The second paragraph is from a research paper. Again, a question about the
correlation among independent variables:
Are collinearity diagnostics part of the correlation matrix? Should we say that the
correlation matrix alleviates (reduces) correlation, or that it simply detects
correlation among independent variables and that we then use collinearity
diagnostics to see whether multicollinearity is within an acceptable level? (my
questions)
Correlation matrix is for diagnosing collinearity. However, it is not a good
statistical metho
w****a
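Posts: 1623
6
From topic: Statistics board - A statistics question (spurious correlation)
Thanks for your reply. I looked up examples of spurious correlation online and I understand them. I also
checked other material, including some books available online. Most of the examples have the form C -> A, C -> B, so a
correlation analysis of A and B is spurious. But some sources also say that if Y is itself a function of X,
then the correlation is spurious. Weight is by definition density times volume, so when I regress weight
on volume, I am effectively regressing density times volume on volume. Volume appears on both sides of the equation, so doesn't that give a
spurious correlation?
I work in environmental science. The main reason I ask is that I ran some regressions of pollutant load on river discharge, and later
found related literature online. Much of it says the correlation between pollutant load and river discharge is a
spurious correlation, because pollutant load equals concentration times discharge, so discharge appears on both sides of
the equation. Some suggest regressing concentration on discharge instead. But first,
I already know from the mechanism that concentration and discharge have no simple correlation, and second, isn't concentration also equal to pollutant load divided by discharge,
so that discharge again appears on both sides of the equation?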
P****D
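Posts: 11146
7
From topic: Statistics board - Baozi reward for help: correlation
I'm floored... The poster in the second reply needs to start over from scratch!!! Get the basic concepts straight!!!
The definition of correlation:
http://en.wikipedia.org/wiki/Correlation
Anything that is not independent is said to have correlation, whether the relationship is linear or nonlinear. The linear relationship
people usually talk about is Pearson's correlation, which is only one kind of correlation. If you measure the OP's
A and B with Spearman's, you get rho=1. Besides these two methods, there are other tests that describe correlation.
As for "r2=0.32,P=ns": ns = not significant, and whoever writes it that way should be sent back too. The accepted practice now
is to report the actual number whether or not the p-value is significant. OP, please do not copy that habit.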
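To make the Pearson versus Spearman point concrete, here is a small sketch using the relation A = B/(1-B) from the original question. It assumes all B values lie in (0, 1), where the map is strictly increasing; the helper names are made up, and the rank function does not handle ties.

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n; assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


B = [i / 100 for i in range(1, 100)]   # 0.01 ... 0.99
A = [b / (1 - b) for b in B]           # strictly increasing in B on (0, 1)

print(pearson(B, A))    # well below 1: the relation is far from linear
print(spearman(B, A))   # 1 up to rounding: the relation is monotone
```

If the B values straddled 1, the map would no longer be monotone and Spearman's rho would drop below 1 as well.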
a******1
Posts: 201
8
He does not have 10,000 data points from one sample; what he has is actually
one data point with 10,000 dimensions. Although I said that he is the only one
who can determine whether the "correlation coefficient" he calculated is the right way to
characterize the distribution of his data, I have come to the conclusion
that his "correlation coefficient" does not mean much, or that he simply
misunderstood the concept of a correlation coefficient. As a simple example
for him to understand, let's say we have two people, and w...
m********0
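Posts: 2717
9
And the correlation stays stable over a long period, which is also because they are in the same sector.
Stocks that are not in the same sector can also have very high correlation. Take DISCA and AXP (my earlier
script did not restrict pair selection to a single sector, partly to see what would turn up): if I remember correctly, the
correlation of these two symbols is above 99%.
The reason can basically be understood through the shareholders: for these two stocks, three of the top 5 share-holding
institutions overlap.
But I personally think a correlation built on this basis is not very reliable. For example, in the figure below, the BB data show
that the correlation of these two symbols is not stable.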
m********0
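Posts: 2717
10
I have read several books on pair trading, and this one feels like the most theoretically self-consistent.
Some were a real waste of time. The Wiley Trading series also has quite a few that look like snake oil.
However, the kind of risk arbitrage opportunity he describes is rare now. The most recent one seems to be PALM?
My understanding is exactly the same as yours: the hedged part should be correlated, and
I pointed out in my original post why it has to be correlated: that way you don't have to keep adjusting the
hedge ratio, which is hard for retail traders to do.
If they are not correlated, it is impossible to stay market neutral while also
leaving the distribution of the unhedged part intact (adjusting the hedge ratio
would invalidate the statistics that follow).
And if the prices themselves are correlated, one infers that the hedged parts are correlated, assuming the
common trend is more significant than the specialized trend.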
s********k
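Posts: 6180
11
[The following is reposted from the Stock board]
From: silverhawk (silverhawk), Board: Stock
Subject: Is there an approximate algorithm for computing time-series correlation in real time? (repost)
Posted at: BBS 未名空间站 (Fri May 21 18:39:21 2010, US Eastern)
From: silverhawk (silverhawk), Board: EE
Subject: Is there an approximate algorithm for computing time-series correlation in real time?
Posted at: BBS 未名空间站 (Wed May 19 20:42:59 2010, US Eastern)
Suppose I have two time series x(i), y(i), i=1:n. The usual correlation calculation has to wait until all n samples
have been collected. I now want an online method that computes it in real time: start computing the correlation of the two
series from i=2, and update the correlation each time a new sample arrives. Since the sampling
is incomplete, some error in the correlation is acceptable, but I would like the overall trend to move closer and closer to the final
exact value. Does such a real-time algorithm exist? Thanks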
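One possible answer to this question is a Welford-style running update of the means and co-moments, sketched below. The class name and interface are made up for illustration. Under this scheme the running value is not an approximation of the batch formula: after each sample it equals the ordinary Pearson correlation of all samples seen so far, so after n samples it matches the final value.

```python
from math import sqrt


class OnlineCorrelation:
    """Running Pearson correlation updated one (x, y) sample at a time."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0   # running sum of squared deviations of x
        self.m2_y = 0.0   # running sum of squared deviations of y
        self.c_xy = 0.0   # running sum of cross deviations

    def update(self, x, y):
        self.n += 1
        dx = x - self.mean_x          # deviation from the OLD mean of x
        dy = y - self.mean_y          # deviation from the OLD mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Each update multiplies an old-mean deviation by a new-mean deviation.
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def value(self):
        """Current correlation, or None while it is still undefined."""
        if self.n < 2 or self.m2_x == 0.0 or self.m2_y == 0.0:
            return None
        return self.c_xy / sqrt(self.m2_x * self.m2_y)
```

Each update costs O(1) time and memory. How quickly the running value settles near the full-sample value depends on the data, which is the stationarity question raised in the follow-up post.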
g*****1
Posts: 18
12
Triangle rule.
A-B with an angle of alpha, correlation = cos(alpha)
B-C with an angle of beta, correlation = cos(beta)
A-C with an angle of 180-alpha-beta, correlation = cos(180-alpha-beta)
Correlation between A and C can be explicitly expressed in terms of other
two correlations.
S********a
Posts: 359
13
How should I interpret the correlation in the last three lines of the output below? time is a fixed effect and id is a random effect, so is the final correlation the correlation between time and the intercept?
> #compound symmetry
> fit.gls.cs <- gls(y~time, data=willett, corr=corCompSymm(, form=~time | id))
> summary(fit.gls.cs)
Generalized least squares fit by REML
  Model: y ~ time
  Data: willett
       AIC      BIC    logLik
  1308.340 1320.049 -650.1698
Correlation Structure: Compound symmetry
 Formula: ~time | id
 Parameter estimate(s):
      Rho
0.7061649
Coefficients:
...
w******8
Posts: 59
14
From topic: Statistics board - longitudinal, correlation, useless?
In a clinical trial, we measured the quantity of interest at week 0, 4, 8
and 12. Our primary end point is the measure at week 8. The PI would like to
see if the week 4 measure can predict week 8 measure. So I did a plot of
week 8 change score vs. week 4 change score. Of course the two change scores
are correlated. I have two questions that I would like to discuss with you:
1. Since we are not building a model to do prediction, shall I just report
correlation coefficient along with the p value? ...
s********r
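Posts: 297
15
From topic: Statistics board - A question about correlation and regression
Hi everyone, I'm a beginner. I was just given a project at work about finding out whether a correlation exists.
A previous analyst used proc means and proc freq and concluded in a report that no relationship between the two variables
can be seen from the plots or from the summarized data table. Now I need to determine in a statistical way
whether they are correlated. I'm not a pure statistics person, so all I can think of is the
Pearson correlation from proc corr, looking at the correlation coefficient between -1 and 1 and the p value. I also thought of
running a regression with proc reg. But I have never done a project like this before. What methods do people generally
use to find correlation for this kind of problem, and is there anything else I need to check after running the procs? Thanks for any
pointers!
p.s. There are only two variables; it is not multivariate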
s********k
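Posts: 6180
16
[The following is reposted from the EE board]
From: silverhawk (silverhawk), Board: EE
Subject: Is there an approximate algorithm for computing time-series correlation in real time?
Posted at: BBS 未名空间站 (Wed May 19 20:42:59 2010, US Eastern)
Suppose I have two time series x(i), y(i), i=1:n. The usual correlation calculation has to wait until all n samples
have been collected. I now want an online method that computes it in real time: start computing the correlation of the two
series from i=2, and update the correlation each time a new sample arrives. Since the sampling
is incomplete, some error in the correlation is acceptable, but I would like the overall trend to move closer and closer to the final
exact value. Does such a real-time algorithm exist? Thanks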
s********k
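Posts: 6180
17
But could the correlation from only part of the data differ greatly from the
correlation after the data at all time points have been collected? In other words, what conditions must the time series satisfy for the
correlation from partially collected data to roughly represent the final true correlation? Stationary, or self-similar?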
w**********m
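Posts: 82
18
[The following is reposted from the Statistics board]
From: wirelesscomm (fanls), Board: Statistics
Subject: A question about estimating correlation matrices.
Posted at: BBS 未名空间站 (Fri Jun 8 21:44:29 2007), on-site
From: wirelesscomm (fanls), Board: EE
Subject: A question about estimating correlation matrices.
Posted at: BBS 未名空间站 (Fri Jun 8 21:44:11 2007), relayed
Say we want to estimate the correlation matrix of a random variable X,
where X is a 3×1 vector.
We can use average [X*X'] to estimate this 3×3 correlation matrix.
The question is how many samples are needed for the estimate to be accurate.
Is there any literature on this?
thanks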
l******9
Posts: 579
19
[The following is reposted from the Statistics board]
From: light009 (light009), Board: Statistics
Subject: data clustering by vector correlation distance
Posted at: BBS 未名空间站 (Wed Feb 26 11:17:21 2014, US Eastern)
I am working on data analysis.
Given a group of data vectors, each of them has the same dimension. Each
element in a vector is a floating point number.
V1 [ , , , … ]
V2[ , , , … ]
...
Vn [ , , , … ]
Suppose that each vector has M numbers. M can be 10000.
n can be 200.
I need to find out how to partition the n vector...
a*****n
Posts: 20
20
In digital image processing, it is equivalent to removing the DC part of the
correlation images, which means ignoring the image background. Correlation
of two signals can be an evaluation of the similarity according to
their correlation peak values. However, if the two signals are similar in
shape but have completely different quantitative values, their correlation
will still turn out to be a peak, but you cannot judge the similarity
according to the peak value. This is because they are not unif
a*********r
Posts: 139
21
First, how can you define the correlation between two stochastic processes?
We can only talk about the correlation between two random variables B_1(t)
and B_2(t) for a fixed t, given that B_1 and B_2 are two stochastic processes.
That's why I say it's ill-posed.
You miss the point! When we talk about martingales, we must specify the
filtration. However, when we talk about the correlation between B_1(t) and B_2(t)
(note t is fixed), it has nothing to do with the filtration!
I don't want to be harsh. But you ...
w**********y
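Posts: 1691
22
Back to this ill-posed(?) question of yours:
Given your W_t, t \in [0,1] and a B_t, t \in (1,\infty), if you then want to define a
correlation, it can only be defined as corr(W_{t1}, B_{t2}), with each t in its own
range.
So your correlation of course depends on the time points you pick; it is not some correlation of stochastic processes..
I'm off to eat. I jotted down these thoughts in a hurry, so there may well be mistakes.
To be honest, after timol said my example was wrong, I really did think it over very carefully before I could be sure
I wasn't wrong, and only then did I dig a pit for you...
When playing poker, what I like most and need most is for one of the nine players to start feeling that everything is obvious and then
get impulsive.. Last Xmas that is exactly how I turned 200 bucks into 1000 in 5 minutes.. Let's both take that
to heart..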
a***m
Posts: 74
24
In a multivariate linear regression model, if the observations are
correlated with a known pairwise correlation coefficient,
how would the parameter and standard error estimate deviate from the same
model with independent observations?
If the correlation is very small, are there ways to correct the parameter
and standard error obtained from the model with independent observations to
estimate the parameter and standard error for the model with correlated
observations?
Any help is appreciated!
d*******1
Posts: 293
25
Hi,
I need to analyze some factors, but found that some of them are highly correlated.
Because of the multicollinearity, I averaged them into one factor. Is there a
better way to deal with it?
In addition, among these highly correlated variables, one is negatively
correlated with the others, so I used 1/factor to reverse it to a positive
correlation.
Thanks.
c*******7
Posts: 2506
26
[The following is reposted from the JobHunting board]
From: bergman (烤饼男人,不是伯格曼,更不是褒曼), Board: JobHunting
Subject: A simple problem on correlation coefficient
Posted at: BBS 未名空间站 (Thu Jul 14 12:08:48 2011, US Eastern)
A, B and C are three variables. Suppose we know the correlation coefficient
between A and B is x, and correlation coefficient between B and C is y. I
remember that the correlation coefficient between A and C must be in a
certain range. But I can not find the formula. Can anybody tell me? Thanks.
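The range asked about here follows from requiring the 3x3 correlation matrix of (A, B, C) to be positive semidefinite, which gives x*y - sqrt((1-x^2)(1-y^2)) <= corr(A, C) <= x*y + sqrt((1-x^2)(1-y^2)). Writing x = cos(alpha) and y = cos(beta), the bounds are cos(alpha+beta) and cos(alpha-beta). A small sketch, with a made-up function name:

```python
from math import sqrt


def corr_bounds(x, y):
    """Smallest and largest feasible corr(A, C) given corr(A, B)=x and corr(B, C)=y."""
    half_width = sqrt((1 - x * x) * (1 - y * y))
    return x * y - half_width, x * y + half_width


low, high = corr_bounds(0.6, 0.8)
print(low, high)   # approximately 0.0 and 0.96
```

Only when |x| = 1 or |y| = 1 does the interval collapse to a single value; otherwise corr(A, C) is constrained but not determined.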
a***r
Posts: 420
27
From topic: Statistics board - Residual and Partial Correlation
I'm a TA. I only wanted to put together a teaching demo to show the concepts of confounding and partial correlation,
and ended up getting stuck...
I simulated three variables: outcome, var1, var2. outcome is binary; var1 has
three categories 0/1/2; var2 is continuous. All three are correlated with each other, and my goal is to show
the confounding effect of var2 on the association between var1 and outcome.
The R code is as follows:
> table(outcome)
outcome
 0  1
52 48
> table(var1)
var1
 0  1  2
19 44 37
> summary(var2)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.002583 0.053960 0.105000 0.122800 0.184200 0.392700
> cor(outcome,var1)
[1] 0.2854862
> cor(outcome,var...
s******h
Posts: 539
28
That is actually a good question. For Pearson correlation, a general monotone transformation
does have an effect. Think of it this way: before the transformation the Pearson correlation is Corr(X,
Y); after applying a transformation, say f, it is Corr(f(X), f(Y)). You can work out
how large the difference between the two can be.
There are also other correlation measures that are independent of monotone
transformations, e.g.
http://en.wikipedia.org/wiki/Spearman's_rank_correlation_c
s******o
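Posts: 656
29
From topic: Statistics board - A question about weighted correlation
Could the experts here take a look:
I'm doing an international study covering many countries and have run into a problem: for the variable of interest, the sign of the univariate
correlation coefficient is the opposite of the sign of the multiple regression coefficient. The regression coefficient
is consistent with my hypothesis.
After some analysis, the reason may be that US data make up the vast majority: of about 34,000 records in total,
more than 30,000 are from the US, and the other countries together account for just over 3,000.
In the regression I adjusted the weights by country (weighted OLS regression), and the coefficient of
interest came out as expected. But the Pearson correlation coefficients all have the opposite sign; if I take out the US
data, the problem goes away.
My question is that I want to keep the US data (dropping it would weaken representativeness) while making the
correlation coefficient agree in sign with the weighted regression. Searching online, there seems to be a
weighted correlation, but I don't know whether it is appropriate here. How can it be implemented in SAS or Stata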
s********1
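Posts: 235
30
This is mainly a question about significance testing for the Pearson correlation.
The method used is this:
correlation calculation:
http://www.statisticshowto.com/articles/what-is-the-pearson-cor
t-test:
http://www.vassarstats.net/textbook/ch4apx.html
In practice there are two vectors of dimension 10000 X 1. Because n is large (n=10000), even when the
correlation is not high, like 0.2, the t of the t-test is fairly large and the p-value is very small,
i.e. the statistical significance is very strong, and that is hard to make sense of. Looking at the correlation value, it feels like the two
vectors do not have a strong linear relationship, but looking at the p-value, it is statistically significant,
so it feels like they do have a strong linear relationship. How should this be interpreted? Is there a strong linear relationship between the two
vectors or not?
Many thanks!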
c*****n
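Posts: 46
31
Your t-test tests whether the correlation between the two data sets is zero, right? With ten thousand numbers and a computed
correlation of 0.2, the chance that the correlation is zero is naturally very small.
It is quite hard for data to be absolutely uncorrelated; weak correlation is usually good enough. So you could consider computing the p-value
corresponding to, say, correlation<0.2. Of course, this does not seem to be a common practice in classical statistics.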
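A small sketch of the two calculations discussed in these posts, assuming roughly bivariate-normal data: the usual t statistic for H0: rho = 0, and a Fisher z-transform test of H0: rho <= rho0 for a nonzero threshold rho0. The function names are made up for illustration.

```python
from math import atanh, erf, sqrt


def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def fisher_z_pvalue(r, n, rho0):
    """One-sided p-value for H0: rho <= rho0 against H1: rho > rho0.

    Uses the approximation atanh(r) ~ Normal(atanh(rho), 1 / (n - 3)).
    """
    z = (atanh(r) - atanh(rho0)) * sqrt(n - 3)
    return 1 - normal_cdf(z)


r, n = 0.2, 10000
print(t_statistic(r, n))            # about 20: overwhelming evidence that rho != 0
print(r * r)                        # r^2 = 0.04: only ~4% of variance is linearly explained
print(fisher_z_pvalue(r, n, 0.2))   # 0.5: no evidence that rho exceeds 0.2
```

The p-value answers "is rho different from zero?", while r (or r^2) answers "how strong is the linear relationship?". With n = 10000 the first can be a clear yes while the second remains weak.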
R*****0
Posts: 146
32
To the OP: IMO the samples you drew (either 10 or 8) do not seem to be
independent of each other. An average correlation of 0.9 means you have a lot
of sample pairs with very high (close to 1) correlations, which is not
likely to happen if you draw samples randomly. Anyway, it still does not
tell you much about the sample variances.
For example, say you draw 8 samples from population 2. If you now
multiply all those data by 2, you will get the same 28 correlations.
However, the variances will be...
t**********y
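Posts: 374
33
It seems the problem is more complicated than I thought.
1. Here population 1 is females; the sample is 10 female individuals, and what is measured is 10,000 gene expression
values. I compute the correlation coefficient of these 10,000 gene expression values between every pair of females,
so 45 in total, with mean 0.41
2. Here population 2 is males; the sample is 10 male individuals, and what is measured is the same 10,000
gene expression values. I then compute the correlation
coefficient of these 10,000 gene expression values between every pair of males, so 28 in total, with mean 0.9
3. What I want to say is that the differences in gene expression among males are smaller than among females
After reading all this discussion, I wonder whether a U-test could be used?
The 45 correlation coefficients in females and 28 correlation coefficients in
males are apparently not normally distributed, judging by the histograms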
e*n
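Posts: 1511
34
[The following is reposted from the Stock board]
From: mitbbs2020 (bbc), Board: Stock
Subject: Re: [Strategy Forum] Trading with correlations.
Posted at: BBS 未名空间站 (Sun Jul 25 19:52:04 2010, US Eastern)
Using options is certainly possible, but not required.
Options give you higher leverage,
but the drawback is that retail traders trade small size, so market neutrality is hard to achieve.
For example, with a hedge ratio of 1.37, you need to long 137 and short 100 to achieve it.
13 and 10 is not as good and is hard to adjust; the attached chart shows how not-so-neutral it gets.
As for stocks being inconvenient for hedging, I don't understand what you mean.
This is inherently a market neutral strategy.
I tested DISCA and AXP (paper account) and tracked them for at least 2 months; the gain or loss never exceeded 3%,
but that conversely shows that this is not a good pair.
With other pairs, without any leverage, some can make over 100% in a year. If I actually reached
that % trading stocks,
I would already be very satisfied...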
d*****u
Posts: 17243
35
This kind of research probably could not get approved in either the US or China.
They studied the statistical relationship between common European Y-chromosome haplogroups and IQ
and concluded that group A (including R, N, I, etc.) has higher IQ than group B (J, E, T, L)
https://lesacreduprintemps19.files.wordpress.com/2012/05/haprinderm.pdf
5. Discussion
Based on our model, it appears that national cognitive
ability is confounded with the general development of
society. This is shown by the high correlations with HDI and
the observation that in three analyses HDI accounted for the
largest mean share of the variance in national cognitive
ability. The mean across two regres...
b*****n
Posts: 143
36
From topic: JobHunting board - A simple problem on correlation coefficient
A, B and C are three variables. Suppose we know the correlation coefficient
between A and B is x, and correlation coefficient between B and C is y. I
remember that the correlation coefficient between A and C must be in a
certain range. But I can not find the formula. Can anybody tell me? Thanks.
w********s
Posts: 59
37
Do you want to compute the cross-correlation sequence between two time series?
Or the normalized cross-correlation sequence? This should be easy to do, but
you'd better show exactly how you computed the correlation originally ...
t*****t
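Posts: 244
38
If it is known that each of the two time series is generated by some process/model, then the correlation is in theory
also known;
it depends on the underlying models.
If you want to measure the correlation dynamically from real-time sample data, then the only option is to use an expanding
window or a
rolling window to measure the correlation dynamically. Exactly how you measure it is of course up to you: sample corr,
robust corr, whether to account for time-varying volatility, and so on.
Correlation is an abstract concept, and there is no single exact value. Of course you can use the forward realized
corr as a
benchmark to judge how good your algorithm is.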
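The rolling-window option mentioned in this post could look roughly like the sketch below, which uses the plain sample correlation over the most recent `window` pairs. The function name is made up; robust or volatility-adjusted estimators would replace the inner formula.

```python
from collections import deque
from math import sqrt


def rolling_correlation(xs, ys, window):
    """Yield the sample correlation of the last `window` (x, y) pairs.

    Nothing is yielded until `window` pairs have been seen.
    """
    buf = deque(maxlen=window)          # old pairs fall out automatically
    for x, y in zip(xs, ys):
        buf.append((x, y))
        if len(buf) < window:
            continue
        mx = sum(a for a, _ in buf) / window
        my = sum(b for _, b in buf) / window
        sxy = sum((a - mx) * (b - my) for a, b in buf)
        sxx = sum((a - mx) ** 2 for a, _ in buf)
        syy = sum((b - my) ** 2 for _, b in buf)
        if sxx == 0.0 or syy == 0.0:
            yield float("nan")          # undefined for a constant window
        else:
            yield sxy / sqrt(sxx * syy)
```

An expanding window is the special case where nothing is ever dropped. The window length trades responsiveness against noise, so there is no single correct choice.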
m********0
Posts: 2717
39
I thought about it again, and I think correlation IS the screener.
A cointegration model implies that the common trends are linearly related,
which means perfectly correlated (+/- 1 correlation).
Otherwise, there is no way to hedge in a stationary sense.
Thank you for making me think it through again :) It has been quite some time
since I embraced options and forgot about stocks.
m********0
Posts: 2717
40
Thank you, waiting for your clarification.
I did not mean that the price itself would be (perfectly) correlated.
From "Pairs Trading: Quantitative Methods and Analysis"
by Ganapathy Vidyamurthy, p. 88:
Inference 1: In a cointegrated system with two time series, the
innovations sequences derived from the common trend components must be perfectly
correlated. (Correlation value must be +1 or -1).
s********1
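Posts: 235
41
[The following is reposted from the Database board]
From: someone111 (some), Board: Database
Subject: How do I load a correlation matrix into a database as a table?
Posted at: BBS 未名空间站 (Tue Dec 8 16:13:07 2015, US Eastern)
How do I load a correlation matrix into a database as a table? There are about
1,000,000 X 1000 vectors, a correlation is computed for each pair of vectors, and these need to go into a database
table. Each vector has a corresponding constant string, so storing the vectors themselves in the
database table is not involved. What is a good way to do this? The main point is that later I may need to use these constant
strings that represent the vectors as keys to join tables.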
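One common layout for this kind of data is a "long" table with one row per pair, keyed by the two strings. The sketch below uses SQLite purely for illustration; the table name, column names, and example keys are all made up, and storing each unordered pair once (smaller key first) is a design assumption, not something the post specifies.

```python
import sqlite3


def store_correlations(conn, pairs):
    """Store (key_a, key_b, corr) triples, one row per unordered pair."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS correlation (
               key_a TEXT NOT NULL,
               key_b TEXT NOT NULL,
               corr  REAL NOT NULL,
               PRIMARY KEY (key_a, key_b),
               CHECK (key_a < key_b)
           )"""
    )
    # Normalize the key order so (a, b) and (b, a) map to the same row.
    rows = [(min(a, b), max(a, b), r) for a, b, r in pairs]
    conn.executemany("INSERT OR REPLACE INTO correlation VALUES (?, ?, ?)", rows)
    conn.commit()


def lookup(conn, a, b):
    """Return the stored correlation for a pair, or None if it is absent."""
    row = conn.execute(
        "SELECT corr FROM correlation WHERE key_a = ? AND key_b = ?",
        (min(a, b), max(a, b)),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
store_correlations(conn, [("vecB", "vecA", 0.41), ("vecA", "vecC", -0.2)])
print(lookup(conn, "vecA", "vecB"))
```

The diagonal (always 1) is not stored, and the composite primary key doubles as the index for joins on the string keys. For very large numbers of pairs, bulk loading and keeping only pairs above some threshold are worth considering.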
b*****e
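Posts: 2511
43
From topic: WaterWorld board - Baozi reward for help: correlation (repost)
[The following is reposted from the Statistics board]
From: bechone (绿茶), Board: Statistics
Subject: Baozi reward for help: correlation
Posted at: BBS 未名空间站 (Fri Sep 7 00:17:55 2012, US Eastern)
If A and B are nonlinearly related, do they still have correlation? For example A = B/(1-B): if you then directly measure the
correlation of A and B, will you get a very small value?
Also, what does P=ns mean in "r2=0.32,P=ns"?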
a****m
Posts: 693
44
Pearson correlation and Euclidean distance are two common similarity metrics.
The correlation coefficient is insensitive to differences in the magnitude of the
variables, so it is regarded as a measure of shape. Euclidean
distance measures both the magnitude and the direction of change. It can be shown
that correlation and Euclidean distance are equivalent after standardization.
Here standardization means rescaling each sample to have
mean 0 and SD 1.
Sometimes, there could be some data ...
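The equivalence claimed in this post can be checked numerically: for two vectors of length n that are each standardized to mean 0 and (population) SD 1, the squared Euclidean distance equals 2n(1 - r), where r is their Pearson correlation. A short sketch with made-up helper names:

```python
from statistics import mean, pstdev


def zscore(v):
    """Standardize to mean 0 and population SD 1."""
    mu, sd = mean(v), pstdev(v)
    return [(a - mu) / sd for a in v]


def pearson(x, y):
    """Pearson correlation as the mean product of z-scores."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
d2 = squared_euclidean(zscore(x), zscore(y))
print(d2, 2 * len(x) * (1 - pearson(x, y)))   # the two numbers agree
```

So ranking pairs by distance on standardized data gives the reverse of ranking them by correlation. On raw data the two metrics can disagree, because distance also reacts to level and scale.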
b*****e
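Posts: 2511
45
From topic: Biology board - Baozi reward for help: correlation (repost)
[The following is reposted from the Statistics board]
From: bechone (绿茶), Board: Statistics
Subject: Baozi reward for help: correlation
Posted at: BBS 未名空间站 (Fri Sep 7 00:17:55 2012, US Eastern)
If A and B are nonlinearly related, do they still have correlation? For example A = B/(1-B): if you then directly measure the
correlation of A and B, will you get a very small value?
Also, what does P=ns mean in "r2=0.32,P=ns"?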
z****f
Posts: 484
46
First, Bayesian game = incomplete information game.
A correlated eqm is defined as a distribution over the set of strategy profiles
such that, under that distribution, every player maximizes his payoff against the
others.
A mixed strategy eqm is a special case of a correlated eqm, where the distribution
can be expressed as the product of the players' mixed strategies (distributions
over their own strategies).
s********k
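Posts: 6180
47
Suppose I have two time series x(i), y(i), i=1:n. The usual correlation calculation has to wait until all n samples
have been collected. I now want an online method that computes it in real time: start computing the correlation of the two
series from i=2, and update the correlation each time a new sample arrives. Since the sampling
is incomplete, some error in the correlation is acceptable, but I would like the overall trend to move closer and closer to the final
exact value. Does such a real-time algorithm exist? Thanks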
r******h
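Posts: 656
48
From topic: Mathematics board - A question about the CORRELATION COEFFICIENT
[The following is reposted from the Statistics board]
From: rayleigh (一塌糊涂※生命的坐标), Board: Statistics
Subject: A question about the CORRELATION COEFFICIENT
Posted at: BBS 未名空间站 (Wed Feb 13 21:37:49 2008), relayed
Not sure whether this board is the right place...
I have a set of data from which I can obtain the CORRELATION COEFFICIENT;
the formula is in any ordinary statistics textbook and is very simple.
Now I need the variance of this CORRELATION COEFFICIENT estimate.
Is there a formula I can use?
Many thanks in advance.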
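Two standard large-sample answers to this question, both assuming the data are roughly bivariate normal, are sketched below: the approximation Var(r) ~ (1 - r^2)^2 / (n - 1), and the Fisher z-transform, for which atanh(r) has variance close to 1 / (n - 3). The function names are made up for illustration.

```python
from math import atanh, sqrt, tanh


def approx_var_r(r, n):
    """Large-sample approximation to the variance of the sample correlation."""
    return (1 - r * r) ** 2 / (n - 1)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for rho via the Fisher z-transform.

    z_crit = 1.959964 corresponds to a two-sided 95% interval.
    """
    z = atanh(r)
    half = z_crit / sqrt(n - 3)
    return tanh(z - half), tanh(z + half)


print(approx_var_r(0.5, 100))
print(fisher_ci(0.5, 100))
```

The Fisher interval is usually preferred when r is far from 0, because the sampling distribution of r itself is skewed there. For clearly non-normal data a bootstrap is a common alternative.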
q******1
Posts: 220
49
If the correlation coefficient of X, Y is 0.7 and the correlation
coefficient of Y, Z, then what are the smallest and largest possible value
of the correlation coefficient of X and Z.
Thanks for any answers.
c******s
Posts: 90
50
Please see the link below.
http://www.noise.cz/sbra/sibram02/2-Ses/Fegan.htm
In this paper, the author presents a way to generate correlated uniform R.V.s.
Another result you may be able to use is:
If X and Y are bivariate-normal with correlation rho,
let Ux=normcdf(X) and Uy=normcdf(Y); then
Ux and Uy are bivariate-uniform with correlation
(6/pi)*arcsin(rho/2).
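The second result in this post can be checked with a quick simulation. The sketch below draws correlated standard normals, maps them through the normal CDF, and compares the sample correlation of the resulting uniforms with (6/pi)*arcsin(rho/2). The function names, sample size, and seed are arbitrary choices for illustration.

```python
import random
from math import asin, erf, pi, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def correlated_uniforms(rho, n, seed=0):
    """Draw n pairs of Uniform(0, 1) variables from a bivariate normal with correlation rho."""
    rng = random.Random(seed)
    us, vs = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + sqrt(1 - rho * rho) * rng.gauss(0, 1)   # corr(x, y) = rho
        us.append(normal_cdf(x))
        vs.append(normal_cdf(y))
    return us, vs


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rho = 0.7
us, vs = correlated_uniforms(rho, 20000)
print(pearson(us, vs), (6 / pi) * asin(rho / 2))   # the two should be close
```

To hit a target uniform correlation r_u, invert the formula and use rho = 2*sin(pi*r_u/6) for the underlying normals.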