
GPU Support for Multi-User JupyterHub in Docker Containers

2019-12-28 23:30

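The previous article covered installing multi-user JupyterHub under Docker. Once it is installed, however, a problem appears: the GPU cannot be used inside Docker, which is a deal-breaker for JupyterHub. This article walks through how to solve that.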

nvidia-docker

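At first nvidia-docker looked like the best solution: after installing it, pass --gpus all when running Docker so that the container can use the GPU. That approach, however, only gives the JupyterHub container itself GPU access. In multi-user JupyterHub, each user gets a separate container, and those single-user containers are created by DockerSpawner through the create_container API call. create_container does not support the --gpus parameter: https://github.com/docker/docker-py/issues/2395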

Solution:

1. Uninstall Docker 19.03 and downgrade to Docker 18.09:

systemctl stop docker
# WARNING: this deletes all local images, containers, and volumes
rm -rf /var/lib/docker
yum remove 'docker*'
yum -y install docker-ce-18.09.0 docker-ce-cli-18.09.0 containerd.io

2. Install the older nvidia-docker package, nvidia-docker2

If nvidia-docker 1.0 was installed previously, remove it first:

docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
yum remove nvidia-docker

Add the package repository and install:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | tee /etc/yum.repos.d/nvidia-docker.repo
yum install -y nvidia-docker2

Configure nvidia-docker2 so that the default runtime is nvidia.

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

Add the content above to /etc/docker/daemon.json, then restart dockerd.
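If /etc/docker/daemon.json already contains other settings, the nvidia entries need to be merged in rather than pasted over the file. The following is a minimal sketch of such a merge using only the Python standard library. The helper names with_nvidia_default and update_daemon_json are hypothetical and not part of Docker or nvidia-docker2; writing the real file would require root.

```python
import json
from pathlib import Path

# Runtime entry taken from the daemon.json snippet above.
NVIDIA_RUNTIME = {"path": "nvidia-container-runtime", "runtimeArgs": []}


def with_nvidia_default(config: dict) -> dict:
    """Return a copy of a daemon.json dict with nvidia set as the default runtime."""
    merged = dict(config)
    runtimes = dict(merged.get("runtimes", {}))
    runtimes["nvidia"] = dict(NVIDIA_RUNTIME)
    merged["runtimes"] = runtimes
    merged["default-runtime"] = "nvidia"
    return merged


def update_daemon_json(path: str = "/etc/docker/daemon.json") -> None:
    """Merge the nvidia runtime into the file at `path`, keeping existing keys.

    json.loads rejects invalid JSON (for example a trailing comma), so a
    malformed file fails here instead of when dockerd restarts.
    """
    p = Path(path)
    current = json.loads(p.read_text()) if p.exists() else {}
    p.write_text(json.dumps(with_nvidia_default(current), indent=4) + "\n")


# Preview the merged result without touching the real file.
print(json.dumps(with_nvidia_default({"log-driver": "json-file"}), indent=4))
```

The merge function is kept separate from the file I/O so the result can be inspected before anything on disk changes.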

jupyterhub/singleuser

The jupyterhub/singleuser image itself ships without any GPU driver components, so the solution is to rebuild it.
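One way to see whether a single-user container can reach the GPU is to run a small check from a notebook cell. The sketch below assumes that the nvidia-smi binary is visible inside the container whenever GPU access works; that is an assumption about the runtime setup, not something guaranteed for every image. The function name list_gpus is hypothetical.

```python
import shutil
import subprocess


def list_gpus():
    """Return the output of `nvidia-smi -L`, or None if no GPU tooling is visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # nvidia-smi is not on PATH: the container has no GPU tooling exposed.
        return None
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


print(list_gpus() or "no GPU visible in this container")
```

Running the same cell before and after the rebuild gives a quick before-and-after comparison.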

The Dockerfile of jupyterhub/singleuser shows that its BASE_IMAGE is jupyter/base-notebook.

# Build as jupyterhub/singleuser
# Run with the DockerSpawner in JupyterHub

ARG BASE_IMAGE=jupyter/base-notebook
FROM $BASE_IMAGE
MAINTAINER Project Jupyter <jupyter@googlegroups.com>

ADD install_jupyterhub /tmp/install_jupyterhub
ARG JUPYTERHUB_VERSION=master
# install pinned jupyterhub and ensure notebook is installed
RUN python3 /tmp/install_jupyterhub && \
    python3 -m pip install notebook

Next, look at the Dockerfile of jupyter/base-notebook:

# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.

# Ubuntu 18.04 (bionic) from 2019-10-29
# https://github.com/tianon/docker-brew-ubuntu-core/commit/d4313e13366d24a97bd178db4450f63e221803f1
ARG BASE_CONTAINER=ubuntu:bionic-20191029@sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d
FROM $BASE_CONTAINER

LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
ARG NB_USER="jovyan"
ARG NB_UID="1000"
ARG NB_GID="100"

USER root

# Install all OS dependencies for notebook server that starts but lacks all
# features (e.g., download as all possible file formats)
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update \
 && apt-get install -yq --no-install-recommends \
    wget \
    bzip2 \
    ca-certificates \
    sudo \
    locales \
    fonts-liberation \
    run-one \
 && apt-get clean && rm -rf /var/lib/apt/lists/*

RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
    locale-gen

# Configure environment
ENV CONDA_DIR=/opt/conda \
    SHELL=/bin/bash \
    NB_USER=$NB_USER \
    NB_UID=$NB_UID \
    NB_GID=$NB_GID \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    LANGUAGE=en_US.UTF-8
ENV PATH=$CONDA_DIR/bin:$PATH \
    HOME=/home/$NB_USER

# Add a script that we will use to correct permissions after running certain commands
ADD fix-permissions /usr/local/bin/fix-permissions
RUN chmod a+rx /usr/local/bin/fix-permissions

# Enable prompt color in the skeleton .bashrc before creating the default NB_USER
RUN sed -i 's/^#force_color_prompt=yes/force_color_prompt=yes/' /etc/skel/.bashrc

# Create NB_USER with name jovyan user with UID=1000 and in the 'users' group
# and make sure these dirs are writable by the `users` group.
RUN echo "auth requisite pam_deny.so" >> /etc/pam.d/su && \
    sed -i.bak -e 's/^%admin/#%admin/' /etc/sudoers && \
    sed -i.bak -e 's/^%sudo/#%sudo/' /etc/sudoers && \
    useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
    mkdir -p $CONDA_DIR && \
    chown $NB_USER:$NB_GID $CONDA_DIR && \
    chmod g+w /etc/passwd && \
    fix-permissions $HOME && \
    fix-permissions "$(dirname $CONDA_DIR)"

USER $NB_UID
WORKDIR $HOME
ARG PYTHON_VERSION=default

# Setup work directory for backward-compatibility
RUN mkdir /home/$NB_USER/work && \
    fix-permissions /home/$NB_USER

# Install conda as jovyan and check the md5 sum provided on the download site
ENV MINICONDA_VERSION=4.7.10 \
    MINICONDA_MD5=1c945f2b3335c7b2b15130b1b2dc5cf4 \
    CONDA_VERSION=4.7.12

RUN cd /tmp && \
    wget --quiet https://repo.continuum.io/miniconda/Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
    echo "${MINICONDA_MD5} *Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh" | md5sum -c - && \
    /bin/bash Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
    rm Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
    echo "conda ${CONDA_VERSION}" >> $CONDA_DIR/conda-meta/pinned && \
    $CONDA_DIR/bin/conda config --system --prepend channels conda-forge && \
    $CONDA_DIR/bin/conda config --system --set auto_update_conda false && \
    $CONDA_DIR/bin/conda config --system --set show_channel_urls true && \
    if [ ! $PYTHON_VERSION = 'default' ]; then conda install --yes python=$PYTHON_VERSION; fi && \
    conda list python | grep '^python ' | tr -s ' ' | cut -d '.' -f 1,2 | sed 's/$/.*/' >> $CONDA_DIR/conda-meta/pinned && \
    $CONDA_DIR/bin/conda install --quiet --yes conda && \
    $CONDA_DIR/bin/conda update --all --quiet --yes && \
    conda clean --all -f -y && \
    rm -rf /home/$NB_USER/.cache/yarn && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

# Install Tini
RUN conda install --quiet --yes 'tini=0.18.0' && \
    conda list tini | grep tini | tr -s ' ' | cut -d ' ' -f 1,2 >> $CONDA_DIR/conda-meta/pinned && \
    conda clean --all -f -y && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

# Install Jupyter Notebook, Lab, and Hub
# Generate a notebook server config
# Cleanup temporary files
# Correct permissions
# Do all this in a single RUN command to avoid duplicating all of the
# files across image layers when the permissions change
RUN conda install --quiet --yes \
    'notebook=6.0.0' \
    'jupyterhub=1.0.0' \
    'jupyterlab=1.2.1' && \
    conda clean --all -f -y && \
    npm cache clean --force && \
    jupyter notebook --generate-config && \
    rm -rf $CONDA_DIR/share/jupyter/lab/staging && \
    rm -rf /home/$NB_USER/.cache/yarn && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

EXPOSE 8888

# Configure container startup
ENTRYPOINT ["tini", "-g", "--"]
CMD ["start-notebook.sh"]

# Add local files as late as possible to avoid cache busting
COPY start.sh /usr/local/bin/
COPY start-notebook.sh /usr/local/bin/
COPY start-singleuser.sh /usr/local/bin/
COPY jupyter_notebook_config.py /etc/jupyter/

# Fix permissions on /etc/jupyter as root
USER root
RUN fix-permissions /etc/jupyter/

# Switch back to jovyan to avoid accidental container runs as root
USER $NB_UID

Solution: change the base image of jupyter/base-notebook (the BASE_CONTAINER argument in the Dockerfile above) to nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
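Since both Dockerfiles declare their base image as an ARG (BASE_CONTAINER and BASE_IMAGE), one possible alternative to editing the files by hand is to override those arguments with docker build --build-arg. The sketch below only assembles and prints the two build commands; it does not run them. The image tags local/base-notebook-gpu and local/singleuser-gpu and the directory names base-notebook and singleuser are placeholders chosen for illustration, assuming local copies of the two build contexts.

```python
import shlex

# CUDA base image named in the article.
CUDA_BASE = "nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04"


def build_commands(base_tag="local/base-notebook-gpu", user_tag="local/singleuser-gpu"):
    """Return the two `docker build` invocations as argument lists (nothing is executed)."""
    return [
        # Step 1: rebuild jupyter/base-notebook on top of the CUDA image.
        ["docker", "build", "--build-arg", f"BASE_CONTAINER={CUDA_BASE}",
         "-t", base_tag, "base-notebook"],
        # Step 2: rebuild jupyterhub/singleuser on top of the result of step 1.
        ["docker", "build", "--build-arg", f"BASE_IMAGE={base_tag}",
         "-t", user_tag, "singleuser"],
    ]


for cmd in build_commands():
    print(" ".join(shlex.quote(part) for part in cmd))
```

The order matters: the single-user image has to be built after the GPU-enabled base image, because it uses that image as its BASE_IMAGE.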

The files referenced in the Dockerfile can be obtained from https://github.com/jupyter/docker-stacks/tree/master/base-notebook
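After the rebuild, DockerSpawner still has to be told to start the new image. A minimal sketch of the relevant jupyterhub_config.py settings follows, wrapped in a function so it can be tried outside JupyterHub. The image tag local/singleuser-gpu and the helper name configure_gpu_spawner are hypothetical. Setting runtime through extra_host_config is an optional extra that should be redundant when nvidia is already the default runtime in daemon.json.

```python
from types import SimpleNamespace


def configure_gpu_spawner(c, image="local/singleuser-gpu"):
    """Point DockerSpawner at the rebuilt GPU-enabled single-user image."""
    c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
    c.DockerSpawner.image = image  # placeholder tag for the rebuilt image
    # Optional assumption: request the nvidia runtime per container as well.
    c.DockerSpawner.extra_host_config = {"runtime": "nvidia"}
    return c


# Inside jupyterhub_config.py the config object `c` already exists, so the call
# would simply be: configure_gpu_spawner(c)
# Outside JupyterHub, a stand-in object is enough to inspect the result:
demo = configure_gpu_spawner(
    SimpleNamespace(JupyterHub=SimpleNamespace(), DockerSpawner=SimpleNamespace())
)
print(demo.DockerSpawner.image)
```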


