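Posts: 1 | 1 When I train an ensemble model, I usually also look at the feature importances.
For example, my model's accuracy is 0.85, and it reports these feature importances:
Importance
Feature-1 0.25
Feature-2 0.09
Feature-3 0.08
Feature-1 is clearly the most important, so I removed it and retrained on the remaining features.
Even without Feature-1, the accuracy is still 0.79.
I don't really understand how feature importance is computed; I just assumed that Feature-1 contributes
about 25%, so removing it should sharply reduce the model's accuracy. Why is it still 0.79?
Is this simply the strength of an ensemble, that it relies on a collection of weak features
and is therefore not very sensitive to losing an important one? |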
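One possible explanation can be sketched with a toy example. This is not the poster's model: the data generator, the noise levels, and the "average and threshold" classifier are all assumptions made for illustration. Three features are noisy views of the same hidden signal, so the cleanest one would rank as most important, yet the other two can largely stand in for it once it is removed.

```python
import random

def make_rows(n=2000, seed=0):
    """Toy data: one hidden signal drives the label and every feature."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        signal = rng.gauss(0.0, 1.0)
        label = signal > 0
        features = (
            signal + rng.gauss(0.0, 0.3),  # feature_1: cleanest view of the signal
            signal + rng.gauss(0.0, 0.8),  # feature_2: noisier proxy
            signal + rng.gauss(0.0, 0.8),  # feature_3: noisier proxy
        )
        rows.append((features, label))
    return rows

def accuracy(rows, columns):
    """Accuracy of a trivial 'ensemble': average the chosen columns, threshold at 0."""
    hits = 0
    for features, label in rows:
        score = sum(features[c] for c in columns) / len(columns)
        hits += (score > 0) == label
    return hits / len(rows)

rows = make_rows()
acc_all = accuracy(rows, [0, 1, 2])
acc_without_f1 = accuracy(rows, [1, 2])
print(f"all features: {acc_all:.3f}, without feature_1: {acc_without_f1:.3f}")
```

The point of the sketch is that an importance share describes how much a fitted model leans on a feature, not how much accuracy is lost when the model is refit without it; correlated features can absorb most of the loss.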
|
w*******y Posts: 60932 | 2 Harry Potter and the Deathly Hallows, Part 2 was released in theaters on
July 15th, 2011 in 2D & 3D. It earned over $380 million domestically
and over $1.3 billion worldwide, making it the 13th highest grossing movie
ever domestically and the 3rd highest grossing movie worldwide. It is also
the highest grossing movie of 2011. Basically, the movie and the franchise
made a ton of money. That would explain why Warner Brothers CEO Barry Meyer
was kind enough to give JK Rowling a white gold ... |
|
w*******y Posts: 60932 | 3 Transformers: Dark of the Moon was released in theaters on June 29th, 2011,
earning over $351 million domestically and over $1.123 billion
worldwide, the second highest grossing movie of 2011. Featuring a
$195 million production budget, every penny shows on the screen.
When it was released on DVD and Blu-Ray on Sept. 30th 2011, it sold more
than 2,628,787 units, earning over $44 million in sales revenue. For those
of you who saw my New Release Thread on the original release of t... |
|
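M***e Posts: 531 | 4 I have a question about how to use VIF (variance inflation factor) for feature selection.
There are over a thousand features, and the requirement is that every feature in the final logistic regression model has a VIF below 2.
I'd like to hear how people go about VIF-based feature selection.
1. Stepwise: each round, drop the feature with the largest VIF, recompute the VIFs of the
remaining features, and repeat until every remaining feature has VIF < 2.
2. In stages: first drop the features with VIF > 100, recompute the VIFs of the rest, then drop those with VIF > 10,
and recompute again.
The dataset is fairly large, so approach 1 takes too long and I'm using approach 2. The risk is that some of the
features removed at the VIF > 100 cut should have been kept, but the blanket cut removes them all.
Any help appreciated, thanks |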
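Approach 1 above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: it uses the identity that the VIFs are the diagonal of the inverse correlation matrix, so one matrix inversion per round yields every VIF at once instead of one regression per feature. The helper names and the toy data are hypothetical, and the code assumes no two columns are perfectly collinear (otherwise the inversion divides by zero).

```python
import random

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        centered = [v - mean for v in col]
        norm = sum(v * v for v in centered) ** 0.5
        unit.append([v / norm for v in centered])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in unit] for ci in unit]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif_scores(columns):
    """VIF of each column = diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]

def prune_by_vif(data, threshold=2.0):
    """Drop the worst feature each round until every VIF is below threshold."""
    kept = dict(data)
    while len(kept) > 1:
        names = list(kept)
        scores = vif_scores([kept[name] for name in names])
        worst = max(range(len(names)), key=lambda i: scores[i])
        if scores[worst] < threshold:
            break
        del kept[names[worst]]
    return list(kept)

rng = random.Random(0)
n = 500
a = [rng.gauss(0, 1) for _ in range(n)]
features = {
    "a": a,
    "b": [rng.gauss(0, 1) for _ in range(n)],
    "c": [v + 0.1 * rng.gauss(0, 1) for v in a],  # nearly a copy of "a"
    "d": [rng.gauss(0, 1) for _ in range(n)],
}
kept = prune_by_vif(features, threshold=2.0)
print(kept)
```

Only one of the near-duplicate pair "a"/"c" is removed, which is the behaviour the blanket VIF > 100 cut in approach 2 does not guarantee. For thousands of features a numerical library would replace the hand-written inversion; statsmodels also provides `variance_inflation_factor` for the one-regression-per-feature formulation.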
|
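M***e Posts: 531 | 6 I have a question about how to use VIF (variance inflation factor) for feature selection.
There are over a thousand features, and the requirement is that every feature in the final logistic regression model has a VIF below 2.
I'd like to hear how people go about VIF-based feature selection.
1. Stepwise: each round, drop the feature with the largest VIF, recompute the VIFs of the
remaining features, and repeat until every remaining feature has VIF < 2.
2. In stages: first drop the features with VIF > 100, recompute the VIFs of the rest, then drop those with VIF > 10,
and recompute again.
The dataset is fairly large, so approach 1 takes too long and I'm using approach 2. The risk is that some of the
features removed at the VIF > 100 cut should have been kept, but the blanket cut removes them all.
How does everyone usually do this?
Any help appreciated, thanks |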
|
o**********e Posts: 18403 | 7 When we argue about the t-shirts to wear;
When we fight about the flag to bear;
Whether to rally, to boycott or to sue, we differ;
When we laugh at others for their culture,
their accent or their skin colors;
I want to remember:
Diversity is Not a Bug, it's a Feature.
Laughter is Not a Bug, it's a Feature.
Love is Not a Bug, it's a Feature.
Forgiveness is Not a Bug, it's a Feature.
Peace is Not a Bug, it's a Feature.
Win-Win is Not a Bug, it's a Feature.
Understanding is Not a Bug, it's a Feature... |
|
z*******n Posts: 1034 | 8 Oracle Announces First Java 9 Features
by Ben Evans on Aug 18, 2014 | Discuss
Oracle has announced the first set of enhancement proposals (known as JEPs)
that will deliver features for Java 9.
Java Enhancement Proposals are a new process that allows features for the
Java language and virtual machine to be developed and explored without
requiring a full specification process (JSR). This means that the scope of
JEPs can be smaller and more targeted, and can also tackle issues that are
specific to t... |
|
w*******y Posts: 60932 | 9 Avatar, the highest grossing movie of all time, was re-released in theaters
on Aug. 27th, 2010, with over six minutes of additional footage added to
create a special edition. As of this posting, the re-release of the movie
made an extra $10 million plus to push its overall domestic gross to over
$760 million and almost $2.8 billion overall worldwide! The
re-release was to approximately 750 3D and IMAX 3D locations only, where it
made its extra cash. Director James Cameron has sa... |
|
w*******y Posts: 60932 | 10 Harry Potter and the Deathly Hallows, Part 1 was released in theaters on
Nov. 19th, 2010, and earned over $295 million domestically and over $952
million worldwide, the 38th highest grossing movie ever! Director David
Yates states that this book had so much material in it that could not be cut
or condensed; it had to be split into two parts, the second of which will
be hitting theaters on July 15th, 2011, in 3D!
A recently released featurette showing the making of part two has people
buzzin... |
|
w*******y Posts: 60932 | 11 I needed a laptop for "getting me by" while I'm at training for my current
(new) position (training lasts 3 months), and the best deal I've been able to
find on a solid laptop that will do what I need is this one at Frys:
Frys Fujitsu AH531-FPCR34121 15.6" Notebook Featuring 2nd Generation Intel
Core i5 Processor:
http://www.frys.com/product/6521903
Detailed Description
(Manufacturer # FPCR34121 )
Fujitsu Lifebook AH531 notebook (FPCR34121)
The new second generation Intel Core i5 processor, del... |
|
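s******m Posts: 2310 | 12 So anything that isn't featured is classed as new.
Then is featured status based on total sales accumulated from the start, or on sales accumulated within a certain period?
Once you reach a certain sales volume you can become featured.
Within New, featured sellers are listed at the top (Featured Merchants) and non-featured sellers below (new). |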
|
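F*********e Posts: 385 | 13 These days many home hunters use Redfin, because its user interface is well done and lets you
quickly check a home's location and details. I'm a real estate agent and mostly use the MLS system,
but Redfin's interface is more user friendly, so apart from the MLS it is one of my favorite sites.
Last year Redfin redesigned the site and removed its most attractive feature: picking a location on the map
while viewing the listing details at the same time. They said it was to accommodate mobile users. Personally
I think this feature is the foundation of Redfin's business and what best set it apart from other sites;
after all, the most important thing in real estate is "location, location, location". Users protested loudly
and swore to abandon Redfin for Zillow unless it was restored, and I joined in too :P.
The crowd prevailed, and after a while the feature came back. Compared with other listing sites,
Redfin also updates listings quickly; only the MLS is faster. As an aside, be careful about
using Zillow to find new listings, since its information updates too slowly, though it works well for looking up past sale records.
But Redfin is not always reliable either; for example, its search has bugs,... |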
|
w********1 Posts: 3492 | 14 Tue, 12 Jun 2012 16:19:32 PDT
Apple claims that iOS 6, the next version of its iPhone and iPad operating
system, is "compatible" with devices as old as the iPhone 3GS. The 3GS was
originally released nearly three years ago in June 2009 -- an eternity in
gadget time.
However, at the bottom of Apple's iOS 6 info page lies a small disclaimer: "
Not all features are available on all devices." This is followed by 8
footnotes detailing exactly what features work on what device.
Some features, like the... |
|
p******s Posts: 76 | 16 I don't want to buy a Japanese car, but I have no good choices in minivans.
I went to a Toyota dealer in NJ.
Sienna LE has a lot of features we like: backup camera, space, dual climate
control (not sure how practical), roof rack (again, not sure when it will
ever be used). It doesn't have a power trunk lid, which I'd like to have.
It has 2 years of free care.
8-passenger with remote: 28186 invoice including destination + 500 + tax,
registration. If we put 2 car seats in the center row, we have to take the 8th
seat out. I am ... |
|
M*****n Posts: 979 | 17 Second-hand trades are at your own risk! Please verify for yourself that the item is legitimate and a first-hand card!:
Y
Item for sale:
H2O feature card $10x2 at $8 each
Face value per card:
$10
Acceptable price (must be stated explicitly!):
$8
Required item condition:
new
Required delivery method:
email
Who bears the risk of loss in shipping (Required if not code only):
Payment method:
non-cc paypal, chase, boa
Additional notes:
A feature card is an add-on that you can purchase to complement your
existing monthly plan.
What you need to know about our Feature Card:
You must have a valid H2O Wireless number that is registered under a Monthly
Unlimited Plan or a $25 Pay-As-You-Go Plan with a positive account balance.
Che... |
|
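q*****i Posts: 30 | 19 [The following text is reposted from the CS board]
From: qiangyi (haha), Board: CS
Subject: A few questions about image feature detection
Posted at: BBS 未名空间站 (Sun Oct 12 05:45:40 2008)
I've just started a project involving image processing and have a few questions.
1. I'm using the Oxford affine covariant detector. Once each affine region has been
extracted, how does it compute the SIFT feature? Is it based on the
scale & orientation at the center of each region?
2. How does this software compute the SIFT feature for a specified region? I see a -p option
among its input parameters, but I haven't found what principle the computation is based on.
3. Does anyone have a proper help manual for this software? For example, when I was extracting region features, a friend
told me that -noangle gives each region only one SIFT feature (when using -p). But when
I run hasaff or hesaff detection directly, |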
|
w********1 Posts: 3492 | 20 Mon, 11 Jun 2012 11:56:05 PDT
At today's WWDC keynote event, Apple announced a release timeframe and
pricing for Mountain Lion, the next version of the Mac operating system
which was previewed earlier this year. Executives also gave a preview of
some new features including dictation, iCloud Tabs, and Power Nap. Mountain
Lion will be released in July through the Mac App Store for $19.99 and
all Macs purchased starting today can receive an upgrade for free.
The new dictation feature will be in... |
|
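q*****i Posts: 30 | 21 I've just started a project involving image processing and have a few questions.
1. I'm using the Oxford affine covariant detector. Once each affine region has been
extracted, how does it compute the SIFT feature? Is it based on the
scale & orientation at the center of each region?
2. How does this software compute the SIFT feature for a specified region? I see a -p option
among its input parameters, but I haven't found what principle the computation is based on.
3. Does anyone have a proper help manual for this software? For example, when I was extracting region features, a friend
told me that -noangle gives each region only one SIFT feature (when using -p). But when
I run hasaff or hesaff detection directly, -noangle has no effect, and each point may have several SIFT
features.
Thanks in advance. |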
|
w*******y Posts: 60932 | 22 X-Men: First Class was released in theaters on June 3rd, 2011, earning over
$146 million domestically and over $352 million worldwide, the 12th
highest grossing movie (so far) in 2011. Matthew Vaughn, director of the
comic book movie Kick-Ass, creates a prequel to the previous four X-Men
films, while it also could be considered a reboot of the whole X-Men movie
franchise after the previous movies X-Men: The Last Stand and X-Men Origins:
Wolverine. However, it still tries to retain contin... |
|
w*******y Posts: 60932 | 23 Transformers: Dark of the Moon was released in theaters on June 29th, 2011,
earning over $351 million domestically and over $1.118 billion
worldwide, the second highest grossing movie (so far) in 2011.
Featuring a $195 million production budget, every penny shows on the screen.
Michael Bay, who needs no introduction here, said that this is the final
Transformers movie in the trilogy of films he plans on directing. Ditto for
Shia LaBeouf, who said this is his last Transformers mov... |
|
w*******e Posts: 285 | 24 Reposted from: http://joe.mehaffey.com/gpshiking.htm
revised 19 June 2002
What features are important for a particular use is a very personal thing.
And the features needed for hiking are a bit more extensive (and a bit
different) from those needed for automobile navigation use. Below is my
"essential hiking feature list" of GPS receiver features.
(I omitted features that are present in ALL receivers.)
1) 12 channel parallel receiver system: Needed for best reception in
difficult terrain and tree |
|
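l********o Posts: 5629 | 25 If you ran a classification of managers versus ordinary employees, you would find three key features for managers:
1. good speaking skills (which implies good English); 2. working hard (especially on non-technical chores); 3. enjoying
socializing.
It is worth pointing out that technical ability, the thing the hard-grinding engineers value most, is not a feature that
separates managers from ordinary employees. Some managers are decent technically, and many are at about the same
technical level as the people they lead.
If you want to be a manager, work on adding those three features. In fact our Indian colleagues spend every day
working on exactly these three features. |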
|
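e***l Posts: 3482 | 26 I've been using Featured First for almost a year, and I now pay eBay around $400 a month for it alone.
Today I happened to check my invoice and found I was charged 74.95 on April 10 and another 74.95 on April 12, both for the same
listing.
My listing is Good 'Til Cancelled, and the 12th was its relist date.
I had always assumed Featured First could be added at any time and would be prorated to the next relist date.
It now looks like eBay simply charges the full 74.95 no matter how few days are left.
I suspect few Featured First users have noticed this. In other words, every listing that uses
Featured First loses on average 74.95/2 ≈ $37.48. The more you use it, the more eBay takes.
I wonder why nobody has sued eBay; this is definitely not mentioned in any policy. It could well be grounds for a class
action. |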
|
l*******r Posts: 4028 | 27 Once you reach a certain sales volume you can become featured.
Within New, featured sellers are listed at the top (Featured Merchants) and non-featured sellers below (new). |
|
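w*********p Posts: 7230 | 28 Is featured seller status per product or per category? If it is per category,
how is it determined?
Say I sell inkjet printer A, inkjet printer B, laser printer C, computer D, and headphones E.
If inkjet printer A is in Featured Sellers, will inkjet printer B be featured too?
What about laser printer C? And computer D and headphones E?
Also, how can I tell whether I'm a featured seller? |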
|
b**********r Posts: 3861 | 29 Features
A flexible premium and adjustable equity indexed life insurance policy is a
universal life insurance policy with an equity-indexing feature. The
insurance company accepts your premium payment and deposits the premium into
your cash value account, which is set up with the policy. Then, the
insurance company deducts the cost of insurance from the cash value account
and invests the premiums into long-term bonds. The interest generated from
the bonds is then invested into index call options... |
|
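y*****7 Posts: 1555 | 30 1. Generally a bug has higher priority than a feature, because a bug is usually found by someone else, which
at least means it is blocking other work. So when a bug comes in, at least take a look and make a first guess at the root cause.
If you can, estimate the fix. If it will take a lot of time, involve the people responsible for the release:
PM, RM, dev lead, tester, tester lead. At least discuss how severe the bug is
and whether you can release first and fix it later. If not, you have to stop feature work and fix the bug. If
so, leave it and fix it later.
2. There's no need to be too anxious about time on feature work either. If being one day late would really sink the company,
you'd better leave quickly. Think about why a new feature is new: it didn't exist before. People have
gone without it for this long, so waiting a few more days won't cause any problem.
The real key is to let your manager/boss know what you are working on and how many resources you need. |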
|
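R******d Posts: 1436 | 31 I've seen some papers that use a genetic algorithm to search for a feature subset, which is then fed to a machine learning
predictor such as an SVM or a neural network. After many iterations they arrive at the best subset, the one with the highest
prediction accuracy.
A genetic algorithm includes crossover, duplication, and mutation steps. What puzzles me is this: if a mutation happens on
the features of one instance, or a crossover happens between the features of two instances,
it will very likely produce instances that do not exist in the original dataset at all. How
can you train on instances that don't exist? At the very least, you wouldn't know
which label these new instances belong to.
Or have I misunderstood, and the genetic algorithm gets the highest prediction accuracy by optimizing instances rather than
features?
My second question: genetic algorithms seem to come in many varieties. I've seen:
GA (genetic algorithm) GP (genetic programming) GE (grammatical evol |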
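In the usual feature-selection setup, the individuals being evolved are feature masks, not data instances, so crossover and mutation never create new training rows. The sketch below illustrates that under stated assumptions: the toy dataset, the nearest-centroid fitness function, the per-feature penalty, and all parameter values are hypothetical, and fitness is measured on the training data for brevity (a real run would use held-out data or cross-validation).

```python
import random

rng = random.Random(0)
N_FEATURES = 6

def make_row():
    """Toy instance: features 0 and 1 carry the label signal, the rest are noise."""
    label = rng.randint(0, 1)
    center = 1.5 if label else -1.5
    x = [rng.gauss(center if j < 2 else 0.0, 1.0) for j in range(N_FEATURES)]
    return x, label

DATA = [make_row() for _ in range(200)]  # the instances are never modified below

def fitness(mask):
    """Nearest-centroid accuracy on the masked-in columns, minus a size penalty."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in DATA if y == label]
        centroids[label] = [sum(r[j] for r in members) / len(members) for j in cols]
    correct = 0
    for x, y in DATA:
        dist = {lab: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += int(min(dist, key=dist.get) == y)
    return correct / len(DATA) - 0.01 * len(cols)

def crossover(a, b):
    """Single-point crossover of two feature masks."""
    cut = rng.randrange(1, N_FEATURES)
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.1):
    """Flip each mask bit with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]

def evolve(pop_size=12, generations=15):
    # Start from the all-features mask plus random masks.
    population = [[1] * N_FEATURES] + [
        [rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - 1)]
        population = [ranked[0]] + children  # elitism: the best mask always survives
    return max(population, key=fitness)

best_mask = evolve()
print(best_mask, round(fitness(best_mask), 3))
```

Each mask only decides which columns of the unchanged dataset the classifier sees, so every training instance keeps its original values and label.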
|
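Posts: 1 | 32 Neither binary encoding nor traditional dummy coding suits decision-tree classifiers, because they restrict
the search for the best split to one-vs-all splits. In practice they still work reasonably well, though. Some papers report that
when cardinality is high, simply ranking the levels by how often they occur and replacing each level with its rank,
turning it into a numerical feature, works just as well. In fact, algorithms that strictly find the optimal, or near-optimal,
split are very slow at high cardinality, which is why R's random forest only supports
categorical features with at most 53 levels. The Python version just uses the rank as a
substitute.
xgboost only supports numeric variables and leaves it to the user to find a suitable encoding.
word2vec learns a distributed (vector) representation of a word from how frequently it co-occurs with
context words in a corpus; the Euclidean distance between these representations preserves some semantic and syntactic
similarity or substitutability. The idea also applies to other features with co-occurrence, not just
NLP. Using this method for feature transformation... |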
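The rank-based encoding mentioned above can be sketched like this. It is a minimal illustration, assuming "rank by occurrence probability" means ranking levels by how often they appear in the column; the function name and the sample data are hypothetical.

```python
from collections import Counter

def frequency_rank_encode(values):
    """Replace each categorical level with its frequency rank (0 = most common)."""
    counts = Counter(values)
    # Ties are broken alphabetically so the mapping is deterministic.
    ordered = sorted(counts, key=lambda level: (-counts[level], level))
    rank = {level: i for i, level in enumerate(ordered)}
    return [rank[v] for v in values], rank

cities = ["NYC", "SF", "NYC", "LA", "SF", "NYC"]
encoded, mapping = frequency_rank_encode(cities)
print(encoded)   # [0, 1, 0, 2, 1, 0]
print(mapping)   # {'NYC': 0, 'SF': 1, 'LA': 2}
```

The encoded column is a single numeric feature, so a tree can split it with ordinary threshold tests instead of searching over subsets of levels. The mapping should be learned on the training data only and reused for new data.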
|
j***3 Posts: 142 | 33 [The following text is reposted from the DataSciences board]
From: j1123 (2134), Board: DataSciences
Subject: Help with a feature selection question
Keywords: features selection
Posted at: BBS 未名空间站 (Thu Aug 6 16:57:24 2015, US Eastern)
Each element of a collection has n different features V 1..n, and the
elements in the collection have a known distance matrix Mknown.
The distance between two elements can also be calculated as ∑|dV|, giving
a calculated distance matrix Mcalu.
How can I select/weight the features to maximize the correlation between Mknown and
Mcalu? |
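One simple way to approach the question above is greedy forward selection with 0/1 weights, scoring each candidate subset by the Pearson correlation between the upper triangles of the two distance matrices. This is only a sketch under assumptions: the function names and the toy data are hypothetical, and continuous weights would need a numerical optimizer instead of this subset search.

```python
import random
from itertools import combinations

def pairwise_distances(points, weights):
    """Upper-triangle weighted L1 distances: sum_k w_k * |x_ik - x_jk|."""
    return [sum(w * abs(a - b) for w, a, b in zip(weights, p, q))
            for p, q in combinations(points, 2)]

def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv) if su and sv else 0.0

def greedy_select(points, known, n_features):
    """Add one feature at a time while the correlation with `known` improves."""
    selected, best = [], float("-inf")
    while True:
        candidate = None
        for k in range(n_features):
            if k in selected:
                continue
            weights = [1.0 if (j == k or j in selected) else 0.0
                       for j in range(n_features)]
            score = pearson(pairwise_distances(points, weights), known)
            if score > best + 1e-9:
                best, candidate = score, k
        if candidate is None:
            return selected, best
        selected.append(candidate)

# Toy check: the "known" distances depend only on features 0 and 2.
rng = random.Random(1)
points = [[rng.random() for _ in range(4)] for _ in range(30)]
m_known = pairwise_distances(points, [1.0, 0.0, 1.0, 0.0])
selected, score = greedy_select(points, m_known, n_features=4)
print(sorted(selected), round(score, 4))
```

Greedy search can miss subsets whose features only help in combination, so it is a baseline rather than a guaranteed optimum.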
|
c******n Posts: 2439 | 34 I often receive emails asking me to sign petitions. I didn't know how that is arranged, so I emailed to ask. We can use this approach
ourselves in the future. See the reply in the email below.
Hello,
I have a question about the promotion for a petition. Occasionally, I
receive emails asking me to sign some petitions. Is that a kind of petition?
If it is, are there any guidelines I can follow?
Thanks, chenchen
Hello chenchen,
Thank you for contacting us. Those petitions are featured petitions.
"Featured" petitions are chosen by our organizers. Choices are made based on
several factors, including the momentum of a petition (i.e. ... |
|
l*******m Posts: 1096 | 35 Lasso, for instance, selects whole features when the features are dense. With sparse
features (e.g. one-hot encoded categorical features), however, it selects individual variables rather than whole features. |
|
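t*****e Posts: 364 | 37 Feature selection methods generally fall into two groups: filter-based and wrapper/embedded.
Forward stepwise can be ruled out for 50,000 features on computation time alone, but why wouldn't lasso work?
Did the interviewer reject it because of computation time, or because the selected features have poor predictive
performance? The glmnet package in R uses coordinate descent, so 50k features should be quite fast. As for
predictive performance, nothing is absolute; it is all dataset dependent. If the interviewer knows the field,
he should have heard of the no free lunch theorem. Maybe he wanted you to mention filter-based
methods like correlation, mutual information, etc.? |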
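For readers unfamiliar with why coordinate descent makes lasso cheap, here is a minimal pure-Python sketch of cyclic coordinate descent with soft-thresholding. It is not glmnet's implementation (which adds screening rules, warm starts along a regularization path, and more); the function names, the toy data, and the value of `lam` are assumptions for illustration.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, kept up to date incrementally
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes j's own term.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new
    return beta

# Toy data: only features 0 and 3 influence y.
rng = random.Random(0)
n, p = 200, 8
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0, 0.1) for row in X]
beta = lasso_cd(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(selected, [round(b, 2) for b in beta])
```

Each coordinate update costs O(n), and coefficients that stay at zero leave the residual untouched, which is part of why the method scales to tens of thousands of features.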
|
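Posts: 1 | 38 About 300 features.
Right, you make a good point: some other features correlated with feature-1 "step up" and take its place.
Still, given that feature-1 accounts for as much as 0.3 of the importance, why did the accuracy drop
by only 6% after removing it? |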
|
w*******y Posts: 60932 | 39 Yes, it has HDMI
Link:
http://www.frys.com/product/6280330?site=sr:SEARCH:MAIN_RSLT_PG
Fujitsu LifeBook AH530 notebook
PN: FPCR33681
The latest Intel Core processor provides all the power needed for your
favorite apps. When coupled with its multimedia features and sleek design,
the Fujitsu AH530 notebook is an excellent choice.
OVERVIEW:
Powered by the latest Intel Core i3 Processor
Offered with Genuine Windows 7 Home Premium operating system
Fast and long range 802.11 B/G/N wireless connectivit ... |
|
w*******y Posts: 60932 | 40 Link:
http://www.buy.com/prod/band-hero-super-bundle-featuring-taylor-swift/q/loc/108/211328319.html
Brought to you by the makers of Guitar Hero, one of the best-selling video
game franchises of all time, Band Hero features the hottest chart-topping
hits from everyone's favorite acts including Taylor Swift, No Doubt, Lily
Allen, The All-American Rejects and Jackson 5. Headlined by some of the
biggest names in music as in-game artists and playable characters, Band Hero
is the ultimate party game t... |
|
w*******y Posts: 60932 | 42 Office Tabs Free Edition
Brings Tabbed Editing, Browsing and Managing User Interface to Microsoft
Office 2010, 2007 and 2003. Office Tabs is fully free for personal users. No
time limit, no feature limitations. The software includes three
components: Office Tabs for Word, Office Tabs for Excel and Office Tabs for
PowerPoint. It is fully free for home use (personal non-commercial use).
Includes these functions - Customize Tab Length, Customize Tab Appearance,
Shortcuts Support, & Show / Hide... |
|
D****9 Posts: 10889 | 43 Dear DVD209
We know you've purchased the Featured First listing upgrade recently, and
wanted to let you know this feature will be discontinued in June and July of
this year.
Our reason for taking this step is simply to make sure we always highlight
the most relevant listings from sellers providing great value and service at
the top of eBay search results.
You do not need to change your listings. Featured First will no longer be
available for purchase with a listing as of June 30, 2010 and will
au |
|
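c****o Posts: 4716 | 44 A few weeks ago I bought a 7-day feature, and sales shot up so much that I oversold. Even after doubling the price
I still sold one or two a day. I planned to relist once I had built up more stock, only to find that eBay has discontinued the feature.
In the post-feature era, I wonder what the experts will rely on to boost sales! |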
|
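b***n Posts: 9346 | 45 I really don't know how they determine featured status. There's a category I have never sold in, yet I count as featured there,
while in the category I sell the most I'm not featured.... |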
|