Help with a multiclass logistic regression problem



Source: WebSpider crawl | Time: 2017-09-29 04:58 | Tags: logistic regression model

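Logistic Regression: An Overview
Logistic regression is one of the machine learning methods most widely used in industry today. It estimates how likely something is: how likely a user is to buy a product, how likely a patient is to have a disease, how likely an ad is to be clicked. (Note the wording: "likelihood", not "probability". The author's caveat is that the raw output should be treated as a score rather than an exact probability: it matches the true probability only to the extent that the model is correctly specified and calibrated. In practice the result is often combined with other feature values in a weighted sum rather than multiplied in directly as a probability.)
So what exactly is it, and where does it apply and where does it not?
I. Formal definition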
(The two formula images here were not recovered; the standard definition, in the notation used in section II, is:)
$$y = \frac{1}{1 + e^{-f(X)}}, \qquad f(X) = \theta_0 + \theta_1 X_1 + \dots + \theta_n X_n$$
Figure 1. The logistic function, with z on the horizontal axis and f(z) on the vertical axis. (Image not recovered. In the caption, z plays the role of f(X) above and f(z) the role of y.)
Logistic regression is a method for learning a function f: X → Y, or equivalently P(Y|X), where Y is discrete-valued and X = <X1, X2, ..., Xn> is any vector whose components take discrete or continuous values.
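II. My explanation
Staring at formulas alone is painful, so let me break it down. Logistic regression has three main ingredients: regression, linear regression, and the logistic function.
1) Regression
Logistic regression builds on linear regression (it is a linear model whose output is passed through the logistic function), and linear regression is a kind of regression. So what is regression?
Regression means estimating the unknown parameters of a known formula. Suppose the formula is y = a*x + b with unknown parameters a and b, and we have many real (x, y) observations (the training samples). Regression uses those observations to estimate a and b automatically. You can picture it like this: given the training samples and the formula, the machine tries candidate values for the unknown parameters (and, with several parameters, their combinations) until it finds the ones that best fit the sample points. (In practice an optimization algorithm does this; nothing is actually enumerated.)
Note that regression presupposes a known formula; without one it cannot proceed. Real life rarely hands us one (even G = m*g only occurred to Newton after the apple hit him on the head, right?), so the formula is usually a guess made by an analyst after looking at a lot of data (often, frankly, a hunch). Depending on the form of the formula, regression is either linear or nonlinear. In linear regression every term is first-degree (a linear equation in one variable, in two variables, and so on); nonlinear regression can take many forms (higher-degree polynomials, logarithmic equations, and so on). Concrete examples follow under linear regression.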
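2) Linear regression
Start with the simplest one-variable example: we want the relationship between x, the price of a shoe, and y, its sales volume. (Why? Because it helps us set the price that earns the most, the kind of word problem we all did in primary school.) We take past sales data (x0, y0), (x1, y1), ..., (xn, yn) as the sample set and assume a linear relationship y = a*x + b, where a and b are not yet known. Linear regression finds the a and b that minimize the error of y = a*x + b over the whole sample set.
You may think: that is trivial, why call it regression? I could plot a few points on scratch paper and draw the line myself. (Admittedly we all suffered through such plotting exercises in middle school.) With one variable it is indeed easy to see, but with several variables it is not. Besides price, the shoe's quality, the advertising spend, and the foot traffic on the shop's street all affect sales, and we want a formula such as sell = a*x + b*y + c*z + d*zz + e. You cannot plot that, and the pattern is hard to spot by eye, so hand it to linear regression. (For how linear regression is computed, consult the literature; it is a set of formulas, and as programmers we can treat it as a single library call.) That is the value of the linear regression algorithm.
Note that linear regression works well here only because y = a*x + b is at least roughly reasonable (we believe that the more expensive the shoe, the fewer are sold, and the cheaper, the more; quality, advertising spend, and foot traffic follow similar monotonic patterns). Not every variable suits linear regression. If x were the shoe size instead of the price, then whatever (a, b) the regression returns, the error would be very high, because both very large and very small sizes sell poorly. In short: if the assumed formula is wrong, no regression will give a good result.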
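As a rough illustration of "handing it to linear regression", here is a minimal Matlab sketch. Everything in it is an assumption made for the example: the feature values are made up, and the sales are generated from known coefficients (trueCoef) only so that the estimate can be compared with a known answer. The backslash operator solves the least-squares problem.

```matlab
% Hypothetical data: each row is one observation.
% Columns: price, quality score, ad spend, foot traffic (all made up).
features = [ 50 3 10 200;
             60 4 12 220;
             55 5  8 180;
             70 4 15 260;
             65 2  9 210;
             45 3 11 240 ];

% Sales generated from known coefficients so the fit can be checked.
trueCoef = [-2; 10; 3; 0.5; 100];             % a, b, c, d, e
A = [features, ones(size(features,1),1)];     % last column multiplies the intercept e
sell = A * trueCoef;

% Least-squares estimate of [a; b; c; d; e].
coef = A \ sell;
disp(coef');
```

With real, noisy data the recovered coefficients would only approximate the underlying ones; the noise-free data here is what lets the estimate reproduce trueCoef up to rounding.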
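3) The logistic function
The sell value above is an arbitrary real number, but often we need the regression to produce a probability-like value between 0 and 1 (will this pair of shoes sell today? will this ad be clicked? we want the number to decide whether to stock the shoe or show the ad). The value must lie between 0 and 1, and sell clearly does not. So the logistic function is introduced to normalize it. As stressed earlier, the author treats this value as a score, not as an exact probability. Then why bother squeezing it into the range 0 to 1? Because a normalized value is comparable and bounded, which matters when you keep computing with it. Say you care not only about sales but want a weighted sum of several factors (the likelihood of selling shoes, local public safety, local shipping cost) to decide whether to open a shoe shop in some town. Normalization keeps one term from swamping the other features, or being swamped by them, merely because its range is much larger or smaller. (An extreme example: if sales range from 100 to unbounded while public safety is expressed on a 0 to 1 scale, adding the two directly ignores public safety entirely.) This is the main reason to use logistic regression rather than plain linear regression. By now you may have realized it: logistic regression is just a linear regression normalized by the logistic function, nothing more.
As for why the logistic function and not some other normalization: it tends to be a reasonable choice, because it flattens extremely large and extremely small values (often noise) so that the mainstream results are not drowned out. The formula and its graph are in section I, Formal definition. There, f(X) is the real-valued sell from our example, and y is the resulting likelihood of a sale, between 0 and 1. (This paragraph says "likelihood" rather than "probability"; thanks to the reader who pointed this out in the comments.)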
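To make the "normalization" concrete, the sketch below pushes a few raw linear scores through the logistic function. The score values are hypothetical; the point is only that arbitrarily large or small inputs land between 0 and 1 and that their order is preserved.

```matlab
% Logistic (sigmoid) function applied elementwise.
sigmoid = @(z) 1 ./ (1 + exp(-z));

scores = [-1000 -5 -1 0 1 5 1000];   % hypothetical raw outputs of a linear model
squashed = sigmoid(scores);          % every value now lies in [0, 1]
disp(squashed);
```

In exact arithmetic the output is strictly between 0 and 1; in floating point the extreme inputs saturate to exactly 0 and 1, which is one practical reason implementations guard their logarithms.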
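III. When logistic regression applies
1) It can be used for likelihood prediction as well as for classification.
Not every machine learning method can output a likelihood (a standard SVM, for example, outputs only a class label of 1 or -1, or an uncalibrated margin). The advantage of a likelihood is that results are comparable: once we have the click likelihood for each ad, we can show the N most likely ones. Whether the likelihoods are all high or all low, we can still take the top N. For classification, simply set a threshold: above it is one class, below it the other.
2) It only handles linear problems.
Logistic regression fits well only when the features relate linearly to the target (more precisely, to the log-odds of the target); unlike a kernel SVM, it does not handle nonlinear structure on its own. This gives two guidelines: if you know in advance that the relationship is nonlinear, do not apply logistic regression to the raw features; and when you do use it, choose features that relate linearly to the target.
3) Features need not satisfy the conditional independence assumption, but each feature's contribution is computed independently.
Unlike naive Bayes, logistic regression does not require the conditional independence assumption (it models P(Y|X) directly rather than assembling it from per-feature likelihoods via Bayes' rule). However, each feature's contribution is computed independently: LR will not combine features into new ones for you (never harbor that illusion; that is the job of decision trees, LSA, pLSA, LDA, or yourself). For example, if you need a TF*IDF feature you must supply it explicitly. Supplying TF and IDF as two separate dimensions is not enough: that only yields something like a*TF + b*IDF, never the effect of c*TF*IDF.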
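A small sketch of the TF*IDF point, with made-up numbers: the product has to be added to the design matrix as its own column before any linear model (logistic regression included) can put a weight on it.

```matlab
tf  = [1; 2; 3; 4];            % hypothetical term frequencies
idf = [0.5; 1.5; 1.0; 2.0];    % hypothetical inverse document frequencies

X_separate     = [tf, idf];             % the model can only learn a*TF + b*IDF
X_with_product = [tf, idf, tf .* idf];  % the third column makes c*TF*IDF learnable
disp(X_with_product);
```

The column names and values are illustrative only; the design choice being shown is that feature interactions are an input to logistic regression, not something it discovers.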
A first Matlab program: Logistic Regression
If the value to predict can only be 0 or 1, linear regression is a poor fit, because it cannot confine its output to the interval (0, 1).
A logistic transform fixes this by mapping the output into (0, 1):
$$g(z) = \frac{1}{1 + e^{-z}}$$
Its graph is symmetric about the point (0, 0.5); that is, g(z) - 0.5 is an odd function.
(The formula images in this section were not recovered; the equations below are the standard forms that match the surrounding text and the code that follows.)
$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
$$P(y=1 \mid x;\theta) = h_\theta(x), \qquad P(y=0 \mid x;\theta) = 1 - h_\theta(x)$$
$$P(y \mid x;\theta) = h_\theta(x)^{y}\,\bigl(1 - h_\theta(x)\bigr)^{1-y}$$
The likelihood function is:
$$L(\theta) = \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}}\,\bigl(1 - h_\theta(x^{(i)})\bigr)^{1-y^{(i)}}$$
The log-likelihood is:
$$\ell(\theta) = \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]$$
Maximum likelihood means maximizing the log-likelihood. We find the maximizing parameters with a gradient method (gradient ascent on ℓ(θ), which is the same as gradient descent on -ℓ(θ)):
$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \bigl(y^{(i)} - h_\theta(x^{(i)})\bigr)\, x_j^{(i)}$$
The resulting iterative update for the parameters is:
$$\theta_j := \theta_j + \alpha \sum_{i=1}^{m} \bigl(y^{(i)} - h_\theta(x^{(i)})\bigr)\, x_j^{(i)}$$
I wrote a simple Matlab implementation, mainly to get familiar with Matlab syntax and functions.
File Logistic_Regression.m contains:
function [theta] = Logistic_Regression(X, Y, alpha)
xSize = size(X);
xRowSize = xSize(:,1);
xColSize = xSize(:,2);
% add one column which is all ones as the first column of X,
% this is for theta(0).
onesColumn = ones(xRowSize,1);
X = [onesColumn, X];
ySize = size(Y);
yRowSize = ySize(:,1);
yColSize = ySize(:,2);
% check parameters
if yColSize ~= 1
    error('The second parameter should contain only one column.');
end
if xRowSize ~= yRowSize
    error('Matrix dimensions do not agree: X should have the same number of rows as Y.');
end
% initialize theta
thetaSize = xColSize + 1;
theta = zeros(thetaSize,1);
esp = 0.0001;
maxIter = 1000;
% loss and iter must exist before the loop; their initial values were lost in the source (assumed here)
loss = 1;
iter = 0;
while loss > esp && iter < maxIter
    % hypothesis(X;theta) = 1/(1+exp(-X*theta));
    hypothesis = -X*theta;
    for i = 1:1:yRowSize
        hypothesis(i) = 1/(1+exp(hypothesis(i)));
    end
    loss = 0;
    for i = 1:1:thetaSize
        % the end of this line was truncated in the source; scaling by alpha is an assumption
        update = (hypothesis - Y)'*X(:,i).*alpha;
        loss = loss + abs(update);
        theta(i) = theta(i) - update;
    end
    iter = iter + 1;
end
display(sprintf('iter %d\tloss:%6.5f\n', iter, loss));
In the Matlab command window, enter:
>> X = [0.0 0.1 0.7 1.0 1.1 1.3 1.4 1.7 2.1 2.2]';
>> Y = [0 0 1 0 0 0 1 1 1 1]';
>> B = Logistic_Regression(X, Y, 0.5)
iter 117 loss:0.00010
B =
   -3.4922
    2.9395
Test with Matlab's built-in function:
>> C = glmfit(X, [Y ones(10,1)], 'binomial', 'link', 'logit')
C =
   -3.4932
    2.9402
As you can see, B and C are close.
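This column (Machine Learning) covers linear regression with one variable, linear regression with multiple variables, an Octave tutorial, logistic regression, regularization, neural networks, machine learning system design, SVMs (support vector machines), clustering, dimensionality reduction, anomaly detection, large-scale machine learning, and more. All content follows Andrew Ng's lectures in the Stanford open course Machine Learning.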
Lecture 3: Logistic Regression & Regularization
Contents:
Logistic Regression
=========================
(1) Classification
(2) Hypothesis Representation
(3) Decision Boundary
(4) Cost Function
(5) Simplified Cost Function and Gradient Descent
(6) Parameter Optimization in Matlab
(7) Multiclass classification: One-vs-all
The problem of overfitting and how to solve it
=========================
(8) The problem of overfitting
(9) Cost Function
(10) Regularized Linear Regression
(11) Regularized Logistic Regression
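This chapter covers logistic regression and the use of regularization to address overfitting. Both matter a great deal: they are among the most commonly used tools in machine learning. The two parts are discussed in turn below.
Part 1: Logistic Regression
(1) Classification
Suppose we want to predict from tumor size whether a patient's tumor is malignant or benign.
Eight data points are given:
[Figure not recovered: tumor size against malignant or benign, with a fitted line]
Suppose linear regression yields the hypothesis shown as the pink line in the figure. We can then pick a threshold of 0.5 for prediction: project the point where the hypothesis equals 0.5 down onto the tumor-size axis; points to its right are predicted y=1 and points to its left y=0. This classifies the data well.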
But what if the data set looks like this?
[Figure not recovered: a second tumor-size data set with its linear fit]
Here, suppose linear regression gives the blue line. The 0.5 boundary derived from this line no longer classifies well, because the linear hypothesis does not satisfy [condition lost in the source; presumably $0 \le h_\theta(x) \le 1$].
(2) Hypothesis Representation
So we introduce the logistic regression model (formula image not recovered; the standard form is shown):
$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$
The sigmoid function, also called the logistic function, is the function g(z) defined above.
When z ≥ 0, g(z) ≥ 0.5; when z < 0, g(z) < 0.5.
Given the data x and the parameters θ, the probabilities of y=0 and y=1 sum to 1:
$$P(y=0 \mid x;\theta) + P(y=1 \mid x;\theta) = 1$$
(3) Decision Boundary
The decision boundary is the boundary, determined by h(x), that separates the region predicted y=1 from the region predicted y=0.
As shown below, suppose a hypothesis of the form h(x)=g(θ0+θ1x1+θ2x2) has parameters θ=[-3,1,1]^T. Then we
predict Y=1 if -3+x1+x2 ≥ 0
predict Y=0 if -3+x1+x2 < 0
which happens to classify the data set in the figure well.
[Figures not recovered: data set with the linear decision boundary x1 + x2 = 3]
Besides linear boundaries there are nonlinear decision boundaries, for example with a hypothesis that includes quadratic terms such as $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$ (formula image not recovered; this form is assumed from the circular boundary described next).
In the figure below, the decision boundary used for classification is a circle of radius 1:
[Figure not recovered: data set with a circular decision boundary of radius 1]
(5) Simplified Cost Function and Gradient Descent
This part covers the simplified cost function for logistic regression and how to implement gradient descent for it.
Suppose y in our data takes only the values 0 and 1.
For a logistic regression model we have $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$, and the cost function is defined as follows (the formula images in this part were not recovered; the standard forms are shown):
$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log h_\theta(x) & \text{if } y = 1 \\ -\log\bigl(1 - h_\theta(x)\bigr) & \text{if } y = 0 \end{cases}$$
Because y only takes the values 0 and 1, this can be written as
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]$$
If you doubt it, substitute y=0 and y=1 in turn: each term of J(θ) matches Cost(hθ(x),y) above. What remains is to find the θ that minimizes J(θ):
$$\min_\theta J(\theta)$$
An earlier chapter (the link text was lost in the source) showed how to apply gradient descent, that is, the Repeat step below, which updates every component of θ simultaneously. The derivative of J(θ) works out to (formula images not recovered; the standard forms are shown):
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)}$$
Substituting it into the Repeat step:
$$\text{Repeat } \Bigl\{\ \theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)}\ \Bigr\} \quad \text{(simultaneously for all } j\text{)}$$
To our surprise, this has exactly the same form as the update rule obtained for linear regression in chapter 1.
In other words, whether h(x) is a linear model or a logistic regression model, the parameter update takes the form above; only the definition of h(x) differs.
So how do we vectorize this? Instead of a for loop that updates each θj one at a time, we update all of θ at once with a matrix multiplication:
$$\theta := \theta - \frac{\alpha}{m} X^T \bigl(g(X\theta) - y\bigr)$$
The formula above gives the update for the parameter vector θ. One more question: lecture 2 discussed how to judge whether the learning rate α is well chosen. How do we judge it for logistic regression?
Q: Suppose you are running gradient descent to fit a logistic regression model with parameter θ ∈ ℝ^(n+1). Which of the following is a reasonable way to make sure the learning rate α is set properly and that gradient descent is running correctly?
A: [answer image not recovered]
(6) Parameter Optimization in Matlab
This part applies some optimizations to logistic regression so that the parameters converge faster, and shows how to compute the optimal parameters in Matlab with gradient-based methods.
First, note that gradient descent is not the only option. The figure below lists three other methods on the left and their shared pros and cons on the right: no need to choose a learning rate α, usually faster, but more complex.
[Figure not recovered: alternative optimization algorithms with their advantages and disadvantages]
Matlab already implements several methods for optimizing the parameters θ, so all we need to do is write the cost function and tell the system which option to use. For example, with 'GradObj': "Use the GradObj option to specify that FUN also returns a second output argument G that is the partial derivatives of the function df/dX, at the point X."
[Figure not recovered: template of a cost function that returns jVal and gradient]
As shown above, given the parameters θ, we must supply the cost function.
jVal is the value of the cost function. For example, suppose we fit the two points (1,0,5) and (0,1,5) with the model hθ(x)=θ1x1+θ2x2.
The cost function J(θ) is then:
jVal=(theta(1)-5)^2+(theta(2)-5)^2;
In each iteration, gradient descent updates the parameters as θ(i) -= α*gradient(i), where gradient(i) is the derivative of J(θ) with respect to θi. In this example gradient(1)=2*(theta(1)-5) and gradient(2)=2*(theta(2)-5), as in the code below:
Function costFunction defines jVal=J(θ) and the gradient with respect to the two θ:
function [jVal, gradient] = costFunction(theta)
%COSTFUNCTION Cost and gradient for the two-point example
%   jVal     - value of J(theta)
%   gradient - partial derivatives of J with respect to theta(1) and theta(2)
jVal = (theta(1)-5)^2 + (theta(2)-5)^2;
gradient = zeros(2,1);
% code to compute derivative to theta
gradient(1) = 2 * (theta(1)-5);
gradient(2) = 2 * (theta(2)-5);
Function Gradient_descent performs the parameter optimization:
function [optTheta, functionVal, exitFlag] = Gradient_descent()
%GRADIENT_DESCENT Minimize costFunction with fminunc
%   'GradObj','on' tells fminunc that costFunction also returns the gradient
options = optimset('GradObj','on','MaxIter',100);
initialTheta = zeros(2,1)
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
Calling it from the Matlab command window gives the optimized parameters (θ1,θ2)=(5,5), that is, hθ(x)=θ1x1+θ2x2=5*x1+5*x2 (the numeric values in the printed output were lost in the source):
>> [optTheta,functionVal,exitFlag] = Gradient_descent()
initialTheta =
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
optTheta =
functionVal =
exitFlag =
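For comparison with the fminunc call, here is a hand-written gradient descent loop on the same quadratic cost. It is only a sketch: the step size α = 0.1 and the iteration count are assumptions. Note that the shorthand θ(i) -= gradient(i) needs a step size in practice; with a step of exactly 1 this particular cost would bounce between 0 and 10 forever.

```matlab
costFn = @(theta) (theta(1)-5)^2 + (theta(2)-5)^2;
gradFn = @(theta) [2*(theta(1)-5); 2*(theta(2)-5)];

alpha = 0.1;                 % assumed step size
theta = zeros(2,1);          % same starting point as initialTheta in the text
for k = 1:200
    theta = theta - alpha * gradFn(theta);   % theta(i) -= alpha * gradient(i)
end
fprintf('theta = (%.4f, %.4f), J = %.2e\n', theta(1), theta(2), costFn(theta));
```

Each step shrinks the distance to (5, 5) by a constant factor of 1 - 2α, which is why a fixed small step converges here.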
(7) Multiclass classification: One-vs-all
The one-vs-all method applies binary classification to a multiclass problem.
To separate K classes, treat one class as positive and the other K-1 classes together as negative, and fit the parameters of K hypotheses hθ(x) this way, one per class. Each hθ(x) then gives, for the given θ and x, the probability that x belongs to that hypothesis's positive class.
[Figure not recovered: one-vs-all with one binary classifier per class]
With this scheme, given an input vector x, the class whose hθ(x) is largest is the class assigned to x.
[Figure not recovered: choosing the class with the largest hθ(x)]
Part 2: The problem of overfitting and how to solve it
(8) The problem of overfitting
Overfitting is what the rightmost plot in each figure below shows. Both models discussed so far (logistic regression and linear regression) can overfit; the two figures illustrate each case:
[Figures not recovered: underfit, well-fit, and overfit examples for the two models]
How do we address overfitting? Two approaches:
Reduce the number of features (decide by hand how many features to keep, or let an algorithm select them)
Regularization (keep all the features, but make the parameters of some of them very small)
Below we discuss regularization in detail.
[Figure not recovered]
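For the linear regression model, the problem is to minimize (the formulas in this passage were lost in the source; the standard forms are shown)
$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$
In matrix notation, i.e. the loss function can be written as
$$J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)$$
From there we can obtain
$$\theta = (X^T X)^{-1} X^T y$$
After regularization, however, we have:
$$\theta = \Bigl(X^T X + \lambda\,\mathrm{diag}(0, 1, \dots, 1)\Bigr)^{-1} X^T y$$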
(9) Cost Function
Regularization works as follows: give θ3 and θ4 very large penalty coefficients in the cost function, so that minimizing the cost function forces θ3 and θ4 to be very small.
[Figure not recovered: penalizing θ3 and θ4 in the cost function]
As a formula, add a penalty on θ1 to θn to the cost function (formula image not recovered; the standard form is shown):
$$J(\theta) = \frac{1}{2m} \Bigl[ \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \Bigr]$$
Take care when setting λ, as this question shows:
Q: [question image not recovered]
A: If λ is very large, all the penalized θ are driven to approximately 0, so the model underfits.
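Next we go through regularization for linear regression and for logistic regression in turn.
(10) Regularized Linear Regression
First, let us see how to apply gradient descent to update the parameters under the cost function above.
For θ0 there is no penalty term, so the update is the same as before.
For every other θj, the derivative of J(θ) gains an extra term (λ/m)*θj (formula image not recovered; the standard forms are shown):
$$\theta_0 := \theta_0 - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_0^{(i)}$$
$$\theta_j := \theta_j - \alpha \Bigl[ \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)} + \frac{\lambda}{m}\,\theta_j \Bigr], \qquad j = 1, \dots, n$$
If we do not use gradient descent but solve for θ directly with matrix algebra (the normal equation), that is, find the θ that minimizes J(θ) by setting every partial derivative of J(θ) with respect to θj to zero, we get:
$$\theta = \Bigl(X^T X + \lambda\,\mathrm{diag}(0, 1, \dots, 1)\Bigr)^{-1} X^T y$$
Moreover, it has been shown that the matrix in parentheses is invertible (for λ > 0).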
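A small sketch of the regularized normal equation, using the four-point data set from the straight-line example later in this article (x = [1;2;3;4], y = [1.1;2.2;2.7;3.8]). The value λ = 1 is an assumption, and the penalty matrix is written as an identity matrix with its first diagonal entry zeroed so that θ0 is left unpenalized.

```matlab
x = [1; 2; 3; 4];
y = [1.1; 2.2; 2.7; 3.8];
X = [ones(4,1), x];              % intercept column plus the single feature

lambda = 1;                      % assumed regularization strength
L = eye(2);
L(1,1) = 0;                      % do not penalize theta_0

thetaPlain = (X' * X) \ (X' * y);                % ordinary normal equation
thetaReg   = (X' * X + lambda * L) \ (X' * y);   % regularized normal equation
disp([thetaPlain, thetaReg]);
```

The backslash solve is used instead of an explicit inverse, which is the usual Matlab idiom; the penalty pulls the slope toward zero while the intercept adjusts to compensate.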
(11) Regularized Logistic Regression
We have already covered the cost function of logistic regression and how it can overfit, as shown below:
[Figure not recovered: an overfit logistic regression example]
As with linear regression, we add a penalty on θ to J(θ) to suppress overfitting (note: θ0 is not penalized, only the other parameters; formula image not recovered, standard form shown):
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$
With gradient descent, differentiating J(θ) with respect to each θj gives the updates:
$$\theta_0 := \theta_0 - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_0^{(i)}$$
$$\theta_j := \theta_j - \alpha \Bigl[ \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)} + \frac{\lambda}{m}\,\theta_j \Bigr], \qquad j = 1, \dots, n$$
Here we find that the update has the same form as the θ update for linear regression; only hθ(x) differs.
Q: When using regularized logistic regression, which of these is the best way to monitor whether gradient descent is working correctly?
A: [answer image not recovered]
As in the earlier Matlab example, we can define the cost function for logistic regression as follows:
[Figure not recovered: code template for a regularized cost function returning jVal and gradient]
In the figure, jVal is the expression for the cost function, whose last term is the penalty on the parameters θ. Below it are the gradients with respect to each θj: θ0 is not in the penalty term, so its gradient is unchanged, while θ1 to θn each gain an extra term (λ/m)*θj.
With that, regularization can address overfitting in both linear and logistic regression.
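Since the code template for the regularized cost function survives only as a missing image, here is one possible Matlab sketch of it. It is not the original code: the data is the ten-point toy set from the first Matlab example in this article, λ = 1 is an assumed value, and the names jValFn, gradFn, and mask are hypothetical. A finite-difference check is included to confirm that the gradient matches the cost.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Toy data from the first Matlab example; lambda is an assumed value.
x = [0.0 0.1 0.7 1.0 1.1 1.3 1.4 1.7 2.1 2.2]';
y = [0 0 1 0 0 0 1 1 1 1]';
m = numel(y);
X = [ones(m,1), x];
lambda = 1;

% mask zeroes out theta_0 so that only theta_1..theta_n are penalized.
mask = [0; ones(size(X,2)-1, 1)];

% Cost: cross-entropy plus (lambda/2m) * sum of squared penalized parameters.
jValFn = @(theta) -mean(y .* log(sigmoid(X*theta)) + (1-y) .* log(1 - sigmoid(X*theta))) ...
                  + lambda/(2*m) * sum((mask .* theta).^2);
% Gradient: unregularized gradient plus (lambda/m)*theta_j for j >= 1.
gradFn = @(theta) X' * (sigmoid(X*theta) - y) / m + (lambda/m) * (mask .* theta);

theta = [-1; 2];                 % arbitrary point at which to compare the two
g = gradFn(theta);

% Central-difference approximation of the gradient.
h = 1e-6;
gNum = zeros(2,1);
for j = 1:2
    e = zeros(2,1);
    e(j) = h;
    gNum(j) = (jValFn(theta + e) - jValFn(theta - e)) / (2*h);
end
disp([g, gNum]);
```

The two columns printed should agree closely; a mismatch would indicate that the cost and gradient are out of sync, which is the kind of bug an optimizer such as fminunc would not report.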
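This article supplements the Machine Learning column. It summarizes topics mentioned in the previous chapters (the link texts were lost in the source) and aims to help readers understand regression better, so I implemented each model in Matlab and introduce them here from easy to hard.
Contents:
Implementing various regression functions
=========================
Y=θ0+θ1X1: linear regression (fitting a straight line)
Handling overfitting: regularization
Y=1/(1+e^(-X)): logistic regression (fitting a sigmoid function)
=========================
Part 1: Basic models
Before tackling the fitting problems, let us first recall the basic models of linear regression and logistic regression.
Let the parameters to fit be θ (n×1) and the inputs be [x (m×n), y (m×1)].
For each kind of fit, following the gradient descent algorithm, we provide two pieces:
① cost function (measures the distance between the true value y and the fitted value h): gives the expression for the cost, which each iteration should reduce, and the gradient, that is, the derivative of the cost function with respect to each parameter θ.
function [ jVal,gradient ] = costFunction ( theta )
② Gradient_descent (main function): runs the optimization, calling the cost function above repeatedly until the maximum number of iterations is reached or the cost function's value stops decreasing.
[optTheta,functionVal,exitFlag]=Gradient_descent( )
Linear regression: the model is hθ(x)=θ0x0+θ1x1+…+θnxn. Powers of the inputs may also serve as terms (formula image not recovered); "linear" here means linear in the parameters θ rather than in x, so the model may be a polynomial in x.
Its cost function is (formula image not recovered; the standard form is shown):
$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$
Logistic regression: the model is hθ(x)=1/(1+e^(-θ^T x)), and its cost function is:
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]$$
Derive the partial derivatives of the cost function with respect to each θj yourself, or see the code later in this article.
Below we fit several model equations in turn, give the code, and verify the results with Matlab's fit function.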
Part 2: Y=θ0+θ1X1: linear regression (fitting a straight line)
An earlier article (the link text was lost in the source) showed how to fit lines and curves with Matlab's built-in fit function, which is very practical. Since we are studying the ML course here, we instead look at how to do the fit with the gradient-based approach described above.
cost function:
function [jVal, gradient] = costFunction2(theta)
%COSTFUNCTION2 Cost and gradient for the straight-line fit
%   linear regression -> y = theta0 + theta1*x
%   parameter: x: m*n   theta: n*1   y: m*1   (m=4, n=1)
% data
x = [1;2;3;4];
y = [1.1;2.2;2.7;3.8];
hypothesis = h_func(x, theta);
delta = hypothesis - y;
jVal = sum(delta.^2);
% gradient of sum(delta.^2); the source divided by m here, which did not match jVal
gradient = zeros(2,1);
gradient(1) = 2*sum(delta);
gradient(2) = 2*sum(delta.*x);
Here h_func computes the hypothesis:
function [res] = h_func(inputx, theta)
%H_FUNC Hypothesis for cost function 2: a straight line
res = theta(1) + theta(2)*inputx;
Gradient_descent:
function [optTheta, functionVal, exitFlag] = Gradient_descent()
%GRADIENT_DESCENT Minimize costFunction2 with fminunc
%   'GradObj','on' tells fminunc that costFunction2 also returns the gradient
options = optimset('GradObj','on','MaxIter',100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction2, initialTheta, options);
Result (the exitFlag value was lost in the source):
>> [optTheta,functionVal,exitFlag] = Gradient_descent()
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
optTheta =
    0.3000
    0.8600
functionVal =
    0.0720
exitFlag =
That is, y=0.3+0.86x.
Verification with Matlab's fit function:
function [parameter] = checkcostfunc()
%CHECKCOSTFUNC Check that the cost function works well,
%   using the Matlab fit function as the reference
% check cost function 2
x = [1;2;3;4];
y = [1.1;2.2;2.7;3.8];
EXPR = {'x','1'};       % linear model terms: a*x + b*1
p = fittype(EXPR);
parameter = fit(x, y, p);
Result:
>> checkcostfunc()
     Linear model:
     ans(x) = a*x + b
     Coefficients (with 95% confidence bounds):
       a =        0.86  (0.4949, 1.225)
       b =         0.3  (-0.6998, 1.3)
This matches our result. Now let us plot it:
function PlotFunc(xstart, xend)
%PLOTFUNC Draw the original data and the fitted line
%===================cost function 2====linear regression
% original data
x1 = [1;2;3;4];
y1 = [1.1;2.2;2.7;3.8];
%plot(x1,y1,'ro-','MarkerSize',10);
plot(x1, y1, 'rx', 'MarkerSize', 10);
hold on;    % keep the data points while drawing the line (assumed; lost in the source)
% fitted line
x_co = xstart:0.1:xend;
y_co = 0.3 + 0.86*x_co;
%plot(x_co,y_co,'g');
plot(x_co, y_co);
hold off;
[Figure not recovered: data points and the fitted line y = 0.3 + 0.86x]
Part 3: Handling overfitting: regularization
The remedy for overfitting was covered in chapter 3: regularization adds a term in θ to the cost function so that some of the θ values stay small, which yields a better fit.
For example, define the cost function J(θ): jVal=(theta(1)-5)^2+(theta(2)-5)^2;
In each iteration, gradient descent updates the parameters as θ(i) -= α*gradient(i), where gradient(i) is the derivative of J(θ) with respect to θi. In this example gradient(1)=2*(theta(1)-5) and gradient(2)=2*(theta(2)-5).
Function costFunction defines jVal=J(θ) and the gradient with respect to the two θ:
function [jVal, gradient] = costFunction(theta)
%COSTFUNCTION Cost and gradient for the quadratic example
%   jVal     - value of J(theta)
%   gradient - partial derivatives of J with respect to theta(1) and theta(2)
jVal = (theta(1)-5)^2 + (theta(2)-5)^2;
gradient = zeros(2,1);
% code to compute derivative to theta
gradient(1) = 2 * (theta(1)-5);
gradient(2) = 2 * (theta(2)-5);
Gradient_descent performs the parameter optimization:
function [optTheta, functionVal, exitFlag] = Gradient_descent()
%GRADIENT_DESCENT Minimize costFunction with fminunc
%   'GradObj','on' tells fminunc that costFunction also returns the gradient
options = optimset('GradObj','on','MaxIter',100);
initialTheta = zeros(2,1)
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
Calling it from the Matlab command window gives the optimized parameters (θ1,θ2)=(5,5) (the numeric values in the printed output were lost in the source):
>> [optTheta,functionVal,exitFlag] = Gradient_descent()
initialTheta =
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
optTheta =
functionVal =
exitFlag =
Part 4: Y=1/(1+e^(-X)): logistic regression (fitting a sigmoid function)
hypothesis function:
function [res] = h_func(inputx, theta)
%H_FUNC Hypothesis for cost function 3: sigmoid of a straight line
tmp = theta(1) + theta(2)*inputx;   % m*1
res = 1./(1+exp(-tmp));             % m*1
cost function:
function [jVal, gradient] = costFunction3(theta)
%COSTFUNCTION3 Cost and gradient for logistic regression
x = [-3; -2; -1; 0; 1; 2; 3];
y = [0.01; 0.05; 0.3; 0.45; 0.8; 1.1; 0.99];
m = size(x,1);
% hypothesis data
hypothesis = h_func(x, theta);
% jVal: cost function; the 0.01 offsets keep the logarithms away from log(0)
jVal = -sum(log(hypothesis+0.01).*y + (1-y).*log(1-hypothesis+0.01))/m;
% gradient updating (derivative of the cost without the 0.01 offsets)
gradient = zeros(2,1);
gradient(1) = sum(hypothesis-y)/m;        % with respect to theta1
gradient(2) = sum((hypothesis-y).*x)/m;   % with respect to theta2
Gradient_descent:
function [optTheta, functionVal, exitFlag] = Gradient_descent()
%GRADIENT_DESCENT Minimize costFunction3 with fminunc
options = optimset('GradObj','on','MaxIter',100);
initialTheta = [0;0];
[optTheta, functionVal, exitFlag] = fminunc(@costFunction3, initialTheta, options);
Result (the exitFlag value was lost in the source):
>> [optTheta,functionVal,exitFlag] = Gradient_descent()
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the function tolerance.
optTheta =
    0.3526
    1.7573
functionVal =
    0.2498
exitFlag =
Plot to verify:
function PlotFunc(xstart, xend)
%PLOTFUNC Draw the original data and the fitted curve
%===================cost function 3=====logistic regression
% original data
x = [-3; -2; -1; 0; 1; 2; 3];
y = [0.01; 0.05; 0.3; 0.45; 0.8; 1.1; 0.99];
plot(x, y, 'rx', 'MarkerSize', 10);
hold on;    % assumed; only the matching "hold off" survived in the source
% fitted line
x_co = xstart:0.1:xend;
theta = [0.3526; 1.7573];   % truncated in the source; taken from the optTheta result above
y_co = h_func(x_co, theta);
plot(x_co, y_co);
hold off
[Figure not recovered: data points and the fitted sigmoid curve]
More Machine Learning study material will follow; stay tuned to this blog and to Sina Weibo.