
How to install the CUDA environment on the NVIDIA Jetson TK1
You have two options for developing CUDA applications for Jetson TK1:
native compilation (compiling code onboard the Jetson TK1)
cross-compilation (compiling code on an x86 desktop in a special way so it can execute on the target Jetson TK1 device)
Native compilation is generally the easiest option, but takes
longer to compile, whereas cross-compilation is typically more
complex to configure and debug, but for large projects it will be
noticeably faster at compiling. The CUDA Toolkit currently only
supports cross-compilation from an Ubuntu 12.04 Linux desktop. In
comparison, native compilation happens onboard the Jetson device
and thus is the same no matter which OS or desktop you have.
Installing the CUDA Toolkit onto your device for native
CUDA development
Download the .deb file for the CUDA Toolkit for L4T. (Make sure you download the Toolkit for L4T and not
the Toolkit for Ubuntu, since that one is for cross-compilation
instead of native compilation.) You will need to register & log
in first before downloading, so the easiest way is perhaps to
download the file on your PC. Then if you want to copy the file to
your device you can copy it onto a USB flash stick then plug it
into the device, or transfer it through your local network such as
by running this on a Linux PC:
scp ~/Downloads/cuda-repo-l4t-r19.2_6.0-42_armhf.deb ubuntu@tegra-ubuntu:Downloads/.
On the device, install the .deb file and the CUDA Toolkit.
cd ~/Downloads
# Install the CUDA repo metadata that you downloaded manually for L4T
sudo dpkg -i cuda-repo-l4t-r19.2_6.0-42_armhf.deb
# Download & install the actual CUDA Toolkit including the OpenGL toolkit from NVIDIA. (It only downloads around 15MB)
sudo apt-get update
sudo apt-get install cuda-toolkit-6-0
# Add yourself to the "video" group to allow access to the GPU
sudo usermod -a -G video $USER
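Group membership only takes effect after you log out and back in. A small sketch (the helper name is my own, not part of the original guide) to check whether a user is already in a group:

```shell
# Check whether a user belongs to a group (illustrative helper).
# id -nG prints the user's group names, one per word.
in_group() {
  # $1: user name, $2: group name
  id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}

# Example: after re-logging in, this should succeed:
#   in_group "$USER" video && echo "GPU access OK"
```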
Add the 32-bit CUDA paths to your .bashrc login script, and
start using it in your current console:
echo "# Add CUDA bin & library paths:" >> ~/.bashrc
echo "export PATH=/usr/local/cuda-6.0/bin:$PATH" >> ~/.bashrc
echo "export LD_LIBRARY_PATH=/usr/local/cuda-6.0/lib:$LD_LIBRARY_PATH" >> ~/.bashrc
source ~/.bashrc
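After sourcing .bashrc you can sanity-check that the CUDA directory was actually prepended. A minimal sketch (the helper name is mine; the cuda-6.0 path matches the install above):

```shell
# Report whether a directory is on PATH (illustrative helper).
on_path() {
  # $1: directory to look for
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

PATH="/usr/local/cuda-6.0/bin:$PATH"
on_path /usr/local/cuda-6.0/bin && echo "CUDA bin directory is on PATH"
```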
Verify that the CUDA Toolkit is installed on your device:
nvcc -V
Installing & running the CUDA samples
(optional)
If you think you will write your own CUDA code or you want to
see what CUDA can do, then follow this section to build & run
some of the CUDA samples.
Install writeable copies of the CUDA samples to your device's
home directory (it will create a "NVIDIA_CUDA-6.0_Samples" directory there):
cuda-install-samples-6.0.sh /home/ubuntu
Build the CUDA samples (takes around 15 minutes on Jetson TK1):
cd ~/NVIDIA_CUDA-6.0_Samples
make
Run some CUDA samples:
1_Utilities/deviceQuery/deviceQuery
1_Utilities/bandwidthTest/bandwidthTest
cd ~/NVIDIA_CUDA-6.0_Samples/0_Simple/matrixMulCUBLAS
./matrixMulCUBLAS
cd ~/NVIDIA_CUDA-6.0_Samples/0_Simple/simpleTexture
./simpleTexture
cd ~/NVIDIA_CUDA-6.0_Samples/3_Imaging/convolutionSeparable
./convolutionSeparable
cd ~/NVIDIA_CUDA-6.0_Samples/3_Imaging/convolutionTexture
./convolutionTexture
Note: Many of the CUDA samples use OpenGL GLX and open graphical
windows. If you are running these programs through an SSH remote
terminal, you can remotely display the windows on your desktop by
typing "export DISPLAY=:0" and then executing the program. (This
will only work if you are using a Linux/Unix machine or you run an
X server such as the free "Xming" for Windows.) For example:
export DISPLAY=:0
cd ~/NVIDIA_CUDA-6.0_Samples/2_Graphics/simpleGL
./simpleGL
cd ~/NVIDIA_CUDA-6.0_Samples/3_Imaging/bicubicTexture
./bicubicTexture
cd ~/NVIDIA_CUDA-6.0_Samples/3_Imaging/bilateralFilter
./bilateralFilter
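The DISPLAY logic above can be wrapped in a small guard so a script skips the GLX samples when no display is available (a sketch; the function name is mine):

```shell
# Decide whether GLX samples can open windows: DISPLAY must be set.
can_run_glx() {
  # $1: value of DISPLAY to test (may be empty)
  [ -n "$1" ]
}

if can_run_glx "${DISPLAY:-}"; then
  echo "DISPLAY is set - GLX samples can open windows"
else
  echo "No DISPLAY - run: export DISPLAY=:0"
fi
```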
Note: the Optical Flow sample (HSOpticalFlow) and 3D stereo
sample (stereoDisparity) take roughly 1 minute each to execute since
they compare results with CPU code.
First, a disclaimer: this post builds TensorFlow r0.9. If you want to run the TensorFlow version of facenet, note that the latest model was built against TensorFlow r0.11, so it will not run on r0.9. I fell into that trap myself and only managed to build r0.9.
Reference links:
//Installation-of-TensorFlow-r0-11-on-TX1/
/questions//tensorflow-on-nvidia-tx1/
Building Tensorflow:
You need to delete everything you possibly can; otherwise, if you build on the device itself, you will run out of space and hit many errors. Either add an SSD to the board or build on an external drive. It is also best to run the build behind a proxy. If you want to build r0.11, see the first link above; I did not succeed.
Delete everything you can:
# get rid of libreoffice, games, libvisionworks, perfkit, multimedia api, opencv4tegra, etc.
sudo apt-get purge libreoffice*
sudo apt-get purge aisleriot gnome-sudoku mahjongg ace-of-penguins gnomine gbrainy
sudo apt-get clean
sudo apt-get autoremove
rm -rf libvision*
rm -rf PerfKit*
# something might be different for you
# delete all libvision-works and opencv4tegra stuff
cd /var && sudo rm -rf libopencv4tegra* && sudo rm -rf libvision*
# I deleted practically everything. Almost as if I shouldn't have even installed JetPack in the first place
# delete all deb files, Firefox, chrome, all the stuff I really didn't need that was taking up memory.
# find big files and remove them assuming they're not important. Google is your friend.
find / -size +10M -ls
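The find command above can be wrapped so the biggest space hogs are listed first; a sketch (the helper name is mine, and the 10M threshold is arbitrary):

```shell
# List the largest files under a directory, biggest first, so you can
# decide what to delete. du -h reports sizes; sort -rh orders them.
largest_files() {
  # $1: directory to scan, $2: number of entries to show
  find "$1" -type f -size +10M -exec du -h {} + 2>/dev/null |
    sort -rh | head -n "$2"
}

# Example: largest_files / 20
```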
Also, there are many .deb packages under /var; you can delete them all, which frees up a lot of space.
Install protobuf, bazel, and tensorflow:
# install deps
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install git zip unzip autoconf automake libtool curl zlib1g-dev maven swig bzip2
build protobuf 3.0.0-beta-2 jar
git clone https://github.com/google/protobuf.git
cd protobuf
# autogen.sh downloads broken gmock.zip in d5fb408d
git checkout master
./autogen.sh
git checkout d5fb408d
./configure --prefix=/usr
make -j 4
sudo make install
cd java
mvn package
# Get bazel version 0.2.1, it doesn't require gRPC
git clone https://github.com/bazelbuild/bazel.git
cd bazel
git checkout 0.2.1
cp /usr/bin/protoc third_party/protobuf/protoc-linux-arm32.exe
cp ../protobuf/java/target/protobuf-java-3.0.0-beta-2.jar third_party/protobuf/protobuf-java-3.0.0-beta-1.jar
Edit bazel so it recognizes aarch64:
--- a/src/main/java/com/google/devtools/build/lib/util/CPU.java
+++ b/src/main/java/com/google/devtools/build/lib/util/CPU.java
@@ -25,7 +25,7 @@ import java.util.Set;
 public enum CPU {
   X86_32("x86_32", ImmutableSet.of("i386", "i486", "i586", "i686", "i786", "x86")),
   X86_64("x86_64", ImmutableSet.of("amd64", "x86_64", "x64")),
-  ARM("arm", ImmutableSet.of("arm", "armv7l")),
+  ARM("arm", ImmutableSet.of("arm", "armv7l", "aarch64")),
   UNKNOWN("unknown", ImmutableSet.of());
./compile.sh
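The reason for the patch above: bazel maps the string reported by `uname -m` to its CPU enum, and on the TX1 the kernel reports aarch64, which bazel 0.2.1 does not recognize. A quick way to see what your machine reports:

```shell
# Show the machine string bazel has to recognize (aarch64 on a TX1,
# armv7l on a TK1, x86_64 on a desktop).
machine=$(uname -m)
case "$machine" in
  aarch64|arm*) echo "ARM-family machine: $machine" ;;
  *)            echo "Non-ARM machine: $machine" ;;
esac
```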
Clone tensorflow:
git clone -b r0.9 /tensorflow/tensorflow.git
./configure
# this will fail, but that's ok
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
Download config.guess and config.sub, and update .cache:
wget -O config.guess 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD'
wget -O config.sub 'http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD'
# below are commands Dwight Crowe ran, yours will vary depending on .cache details.
# look for '_bazel_ubuntu', 'farmhash_archive', and 'farmhash'
cp config.guess ./.cache/bazel/_bazel_ubuntu/742c01ff31b60b1eed9f/external/farmhash_archive/farmhash-34c13ddfab0e9fc050260/config.guess
cp config.sub ./.cache/bazel/_bazel_ubuntu/742c01ff31b60b1eed9f/external/farmhash_archive/farmhash-34c13ddfab0e9fc050260/config.sub
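The hashed directory names under .cache differ on every machine, so rather than hard-coding them you can locate the farmhash source dirs with find; a sketch (the helper name is mine):

```shell
# Copy fresh config.guess/config.sub into every farmhash source dir
# found under a bazel cache root. Run from the directory holding the
# downloaded config.guess and config.sub.
update_config_scripts() {
  # $1: bazel cache root, e.g. ~/.cache/bazel
  find "$1" -type d -name 'farmhash-*' 2>/dev/null |
    while read -r d; do
      cp config.guess config.sub "$d/"
    done
}
```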
Modify the tensorflow source files:
--- a/tensorflow/core/kernels/BUILD
+++ b/tensorflow/core/kernels/BUILD
@@ -985,7 +985,7 @@ tf_kernel_libraries(
         "reduction_ops",
         "segment_reduction_ops",
         "sequence_ops",
-        "sparse_matmul_op",
+        #DC "sparse_matmul_op",
         ":bounds_check",
--- a/tensorflow/python/BUILD
+++ b/tensorflow/python/BUILD
@@ -10,7 +10,7 @@ medium_kernel_test_list = glob([
     "kernel_tests/seq2seq_test.py",
     "kernel_tests/slice_op_test.py",
     "kernel_tests/sparse_ops_test.py",
-    "kernel_tests/sparse_matmul_op_test.py",
+    #DC "kernel_tests/sparse_matmul_op_test.py",
     "kernel_tests/sparse_tensor_dense_matmul_op_test.py",
--- a/tensorflow/core/kernels/cwise_op_gpu_select.cu.cc
+++ b/tensorflow/core/kernels/cwise_op_gpu_select.cu.cc
@@ -43,8 +43,14 @@ struct BatchSelectFunctor<GPUDevice, T> {
     const int all_but_batch = then_flat_outer_dims.dimension(1);
 #if !defined(EIGEN_HAS_INDEX_LIST)
-    Eigen::array<int, 2> broadcast_dims{{ 1, all_but_batch }};
-    Eigen::Tensor<int, 2>::Dimensions reshape_dims{{ batch, 1 }};
+    //DC Eigen::array<int, 2> broadcast_dims{{ 1, all_but_batch }};
+    Eigen::array<int, 2> broadcast_dims;
+    broadcast_dims[0] = 1;
+    broadcast_dims[1] = all_but_batch;
+    //DC Eigen::Tensor<int, 2>::Dimensions reshape_dims{{ batch, 1 }};
+    Eigen::Tensor<int, 2>::Dimensions reshape_dims;
+    reshape_dims[0] = batch;
+    reshape_dims[1] = 1;
 #else
     Eigen::IndexList<Eigen::type2index<1>, int> broadcast_dims;
     broadcast_dims.set(1, all_but_batch);
--- a/tensorflow/core/kernels/sparse_tensor_dense_matmul_op_gpu.cu.cc
+++ b/tensorflow/core/kernels/sparse_tensor_dense_matmul_op_gpu.cu.cc
@@ -104,9 +104,17 @@ struct SparseTensorDenseMatMulFunctor<GPUDevice, T, ADJ_A, ADJ_B> {
     int n = (ADJ_B) ? b.dimension(0) : b.dimension(1);
 #if !defined(EIGEN_HAS_INDEX_LIST)
-    Eigen::Tensor<int, 2>::Dimensions matrix_1_by_nnz{{ 1, nnz }};
-    Eigen::array<int, 2> n_by_1{{ n, 1 }};
-    Eigen::array<int, 1> reduce_on_rows{{ 0 }};
+    //DC Eigen::Tensor<int, 2>::Dimensions matrix_1_by_nnz{{ 1, nnz }};
+    Eigen::Tensor<int, 2>::Dimensions matrix_1_by_nnz;
+    matrix_1_by_nnz[0] = 1;
+    matrix_1_by_nnz[1] = nnz;
+    //DC Eigen::array<int, 2> n_by_1{{ n, 1 }};
+    Eigen::array<int, 2> n_by_1;
+    n_by_1[0] = n;
+    n_by_1[1] = 1;
+    //DC Eigen::array<int, 1> reduce_on_rows{{ 0 }};
+    Eigen::array<int, 1> reduce_on_rows;
+    reduce_on_rows[0] = 0;
 #else
     Eigen::IndexList<Eigen::type2index<1>, int> matrix_1_by_nnz;
     matrix_1_by_nnz.set(1, nnz);
--- a/tensorflow/stream_executor/cuda/cuda_blas.cc
+++ b/tensorflow/stream_executor/cuda/cuda_blas.cc
@@ -25,6 +25,12 @@ limitations under the License.
 #define EIGEN_HAS_CUDA_FP16
+#if CUDA_VERSION >= 8000
+#define SE_CUDA_DATA_HALF CUDA_R_16F
+#else
+#define SE_CUDA_DATA_HALF CUBLAS_DATA_HALF
+#endif
 #include "tensorflow/stream_executor/cuda/cuda_blas.h"
@@ -86,10 +86,10 @@ bool CUDABlas::DoBlasGemm(
     return DoBlasInternal(
         dynload::cublasSgemmEx, stream, true /* = pointer_mode_host */,
         CUDABlasTranspose(transa), CUDABlasTranspose(transb), m, n, k, &alpha,
-        CUDAMemory(a), CUBLAS_DATA_HALF, lda,
-        CUDAMemory(b), CUBLAS_DATA_HALF, ldb,
+        CUDAMemory(a), SE_CUDA_DATA_HALF, lda,
+        CUDAMemory(b), SE_CUDA_DATA_HALF, ldb,
-        CUDAMemoryMutable(c), CUBLAS_DATA_HALF, ldc);
+        CUDAMemoryMutable(c), SE_CUDA_DATA_HALF, ldc);
   LOG(ERROR) << "fp16 sgemm is not implemented in this cuBLAS version "
              << "(need at least CUDA 7.5)";
--- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
+++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
@@ -888,6 +888,9 @@ CudaContext* CUDAExecutor::cuda_context() { return context_; }
 // For anything more complicated/prod-focused than this, you'll likely want to
 // turn to gsys' topology modeling.
 static int TryToReadNumaNode(const string &pci_bus_id, int device_ordinal) {
+  // DC - make this clever later. ARM has no NUMA node, just return 0
+  LOG(INFO) << "ARM has no NUMA node, hardcoding to return zero";
+  return 0;
 #if defined(__APPLE__)
   LOG(INFO) << "OS X does not support NUMA - returning NUMA node zero";
bazel build -c opt --config=cuda --local_resources 3072,4.0,1.0 --verbose_resources //tensorflow/tools/pip_package:build_pip_package --jobs 4
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
sudo pip install /tmp/tensorflow_pkg/tensorflow-0.9.0-py2-none-any.whl
Here is my prebuilt tensorflow 0.9:
link: /s/1hrPd4FE  password: qe5x
The build may fail partway through; retry a few times, switching jobs between 3 and 4. Errors you may see during the build:
unexpected EOF from Bazel server.
internal compiler error: Killed (program cc1plus)
These are all caused by running out of memory. I deleted all the .deb files and built on an external drive. I also hit a cross_tool error; after changing the jobs value a few times the build succeeded.
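Both errors are the out-of-memory killer at work, so the jobs count has to scale with available RAM. A rough heuristic (my own, not from the original post) of one job per GB:

```shell
# Pick a conservative bazel --jobs value: one job per GB of RAM,
# but never less than 1 (heuristic, not from the original post).
pick_jobs() {
  # $1: total memory in MB
  jobs=$(( $1 / 1024 ))
  [ "$jobs" -lt 1 ] && jobs=1
  echo "$jobs"
}

# e.g. on a 4 GB TX1: pick_jobs 4096  -> 4
```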
All in all, this build exhausted me for little gain. To build r0.11 I deleted 0.9, and since I had not saved the 0.9 package, I ended up with nothing.
Give this a like: just as I was finishing the draft, Firefox crashed before I saved and I had to write the whole thing again. On top of that, the market tanked on the rate-hike news; I saw it coming two months ago but was too busy to sell, so I took another big loss.
If anyone succeeds in building r0.11, please let me know; I'm not willing to give up.