Deploying a Kubernetes Cluster on AWS with kops

Posted on 2018-01-11 | Category: Virtualization

kops is an officially recommended tool for quickly deploying production-grade Kubernetes clusters on AWS.

Preparing the Environment

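Before version 1.6.2, deploying a K8s cluster with kops required AWS Route53 to provide DNS. Starting with 1.6.2, kops can deploy gossip-based clusters that no longer depend on Route53, which makes deployment simpler.
Before deploying the cluster, install kubectl, kops, and awscli. The installation steps are: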

$ curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod +x kubectl
$ sudo mv kubectl /usr/local/bin/
$ wget https://github.com/kubernetes/kops/releases/download/1.8.0/kops-linux-amd64
$ chmod +x kops-linux-amd64
$ sudo mv kops-linux-amd64 /usr/local/bin/kops
$ sudo apt-get install awscli
$ sudo pip install awscli

Configure your AWS account credentials:

$ aws configure
AWS Access Key ID [None]: <your-accesskeyID>
AWS Secret Access Key [None]: <your-secretAccessKey>
Default region name [None]: ap-northeast-1
Default output format [None]: json

To deploy a cluster with kops, also create a dedicated IAM user named kops and grant it the required permissions:

$ aws iam create-group --group-name kops
$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess --group-name kops
$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess --group-name kops
$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess --group-name kops
$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/IAMFullAccess --group-name kops
$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonVPCFullAccess --group-name kops
$ aws iam create-user --user-name kops
$ aws iam add-user-to-group --user-name kops --group-name kops

Create an access key for the kops user:

$ aws iam create-access-key --user-name kops

The command above returns the AccessKeyID and SecretAccessKey of the kops user. Next, update the awscli configuration so that it uses the key of the newly created kops user:

$ aws configure
AWS Access Key ID [None]: <accesskeyID-of-kops-user>
AWS Secret Access Key [None]: <secretAccessKey-of-kops-user>
Default region name [None]: ap-northeast-1
Default output format [None]: json

Also export the kops user's credentials as environment variables in the shell:

$ export AWS_ACCESS_KEY_ID=$(aws configure get aws_access_key_id)
$ export AWS_SECRET_ACCESS_KEY=$(aws configure get aws_secret_access_key)
$ export AWS_REGION=$(aws configure get region)

Finally, generate an SSH key:

$ ssh-keygen

Configuring S3

Note that for kops to create a gossip-based cluster, the cluster name must end with .k8s.local. Here the cluster is named cluster.k8s.local:

$ export NAME=cluster.k8s.local

Next, create an S3 bucket to store the cluster's state. Here the bucket is named cluster.k8s.local-state.ym:

$ aws s3api create-bucket --bucket ${NAME}-state.ym --create-bucket-configuration LocationConstraint=$AWS_REGION
$ export KOPS_STATE_STORE=s3://cluster.k8s.local-state.ym

Debugging Multithreaded C++ Programs with gdb

Posted on 2018-01-01 | Category: Performance Profiling

Debugging a Deadlocked Program

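Analyzing a deadlock is relatively straightforward: when a deadlock occurs the process hangs, so we only need to kill it in a way that makes the system produce a core dump file, and then analyze that file. For how to enable core dump generation on Linux, see the separate post on that topic.
In the following C++ program, two threads can each end up waiting for the other to release a mutex, which results in a deadlock: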

#include <iostream>
#include <thread>
#include <mutex>
std::mutex m1, m2;
void foo()
{
    std::lock_guard<std::mutex> g1(m1);
    std::lock_guard<std::mutex> g2(m2);
    std::cout << "I'm foo!" << std::endl;
}
void bar()
{
    std::lock_guard<std::mutex> g1(m2);
    std::lock_guard<std::mutex> g2(m1);
    std::cout << "I'm bar!" << std::endl;
}
int main()
{
    std::thread t1(foo);
    std::thread t2(bar);
    t1.join();
    t2.join();
    return 0;
}

Compile the program and run it in a loop until it deadlocks:

$ g++ -std=c++11 -g -o main main.cpp -lpthread
$ while true ; do
> ./main 
> done &

Find the ID of the deadlocked process and kill it with a signal that makes the system generate a core dump file:

$ ps aux | grep main
ubuntu    4184  0.0  0.1  31844  1792 pts/0    Sl   03:01   0:00 ./main
$ kill -s SIGSEGV 4184
$ ls /var/crash
!home!ubuntu!main.4184.1514790137.11

The system generated a core dump file under /var/crash, which can now be analyzed with gdb:

$ gdb ./main /var/crash/\!home\!ubuntu\!main.4184.1514790137.11

List the threads of the process:

(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 0x7f109b2e4740 (LWP 4184) 0x00007f109aebd9dd in pthread_join (
    threadid=139709282297600, thread_return=0x0) at pthread_join.c:90
  2    Thread 0x7f1099a49700 (LWP 4186) __lll_lock_wait ()
    at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  3    Thread 0x7f109a24a700 (LWP 4185) __lll_lock_wait ()
    at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135

Threads 2 and 3 are both waiting for a mutex. Switch to thread 2 to inspect it:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7f1099a49700 (LWP 4186))]
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135

Print the stack trace of thread 2 to see which line of code is trying to acquire the lock:

(gdb) backtrace
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f109aebee0d in __GI___pthread_mutex_lock (mutex=0x6052e0 <m1>)
    at ../nptl/pthread_mutex_lock.c:80
#2  0x0000000000401007 in __gthread_mutex_lock (__mutex=0x6052e0 <m1>)
    at /usr/include/x86_64-linux-gnu/c++/5/bits/gthr-default.h:748
#3  0x0000000000401536 in std::mutex::lock (this=0x6052e0 <m1>)
    at /usr/include/c++/5/mutex:135
#4  0x00000000004015bc in std::lock_guard<std::mutex>::lock_guard (
    this=0x7f1099a48e50, __m=...) at /usr/include/c++/5/mutex:386
#5  0x00000000004011b7 in bar () at main.cpp:18
#6  0x000000000040289f in std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (this=0x153bdb8) at /usr/include/c++/5/functional:1531
#7  0x00000000004027f8 in std::_Bind_simple<void (*())()>::operator()() (
    this=0x153bdb8) at /usr/include/c++/5/functional:1520
#8  0x0000000000402788 in std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (this=0x153bda0) at /usr/include/c++/5/thread:115
#9  0x00007f109abebc80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007f109aebc70a in start_thread (arg=0x7f1099a49700) at pthread_create.c:333
#11 0x00007f109a65a82d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

The output shows that thread 2 is waiting for the mutex m1 at line 18 of main.cpp, inside bar().

Generating Core Dump Files on Linux

Posted on 2017-12-31 | Category: Performance Profiling

Generating a Core Dump File

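If a process crashes while running, the operating system writes a snapshot of the process to a file called a core dump. Analyzing the core dump afterwards shows why the process crashed.
Because core dump files take up disk space, many Linux systems do not generate them by default. For example, the following command shows that the maximum allowed core dump file size is 0:

$ ulimit -a | grep core
core file size          (blocks, -c) 0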

The following setting allows Linux to generate core dump files:

$ ulimit -c unlimited

Note that this setting only applies to the current login session. To make it persistent, add it to /etc/security/limits.conf:

$ sudo vi /etc/security/limits.conf
* soft core unlimited
* hard core unlimited

Location of Core Dump Files

So where is the core dump stored? That is determined by the kernel parameter kernel.core_pattern. On Ubuntu 16.04, for example, its value is:

$ cat /proc/sys/kernel/core_pattern
|/usr/share/apport/apport %p %s %c %P

The leading | means that the core dump is piped to the apport program, which saves it under /var/crash.
In practice, it is better to specify the core dump directory and the file naming scheme yourself:

$ sudo vi /etc/sysctl.conf
kernel.core_pattern=/var/crash/%E.%p.%t.%s
$ sudo sysctl -p

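This sets the core dump directory to /var/crash and the file name pattern to %E.%p.%t.%s, where:

%E: full path of the executable (each / in the path is replaced by !)
%p: process ID
%t: time of the crash, as a Unix timestamp
%s: number of the signal that caused the crash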

Debugging C++ Programs with gdb

Posted on 2017-12-31 | Category: Performance Profiling

Inspecting the Call Stack

The backtrace command shows the call stack of the current function. The following C++ program demonstrates it:

#include <iostream>

void bar()
{
    std::cout << "I'm bar!" << std::endl;
}

void foo()
{
    std::cout << "I'm foo!" << std::endl;
    bar();
}

int main()
{
    std::cout << "Hello, World!" << std::endl;
    foo();
    return 0;
}

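Compile the program and debug it with gdb. Set a breakpoint at line 5 of the source (inside bar()), run the program, and then use the backtrace command to view the call stack of bar():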

$ g++ -std=c++11 -g -o hello hello.cpp
$ gdb ./hello -tui
(gdb) break 5
Breakpoint 1 at 0x40084a: file hello.cpp, line 5.
(gdb) run
Starting program: /home/ubuntu/hello
Hello, World!
I'm foo!
Breakpoint 1, bar () at hello.cpp:5
(gdb) backtrace
#0  bar () at hello.cpp:5
#1  0x000000000040088e in foo () at hello.cpp:11
#2  0x00000000004008b6 in main () at hello.cpp:17

Debugging a Running Program

gdb can also debug a process that is already running, such as the following C++ program:

#include <iostream>
#include <chrono>
#include <thread>

void sleep()
{
    std::this_thread::sleep_for(std::chrono::seconds(1));
}

int main()
{
    int i = 1000;

    while (true)
    {
        std::cout << "Hello, World!" << std::endl;
        sleep();
        i -= 1;
    }
    return 0;
}

Compile the program and run it in the background:

$ g++ -std=c++11 -g -o sleep sleep.cpp
$ ./sleep &
[1] 5507

The process ID is 5507, so gdb can attach to that process:

$ sudo gdb ./sleep 5507
Attaching to program: /home/ubuntu/sleep, process 5507
0x00007f227c95d740 in __nanosleep_nocancel ()
    at ../sysdeps/unix/syscall-template.S:84
(gdb)

Use the backtrace command to view the call stack:

(gdb) backtrace
#0  0x00007f227c95d740 in __nanosleep_nocancel ()
    at ../sysdeps/unix/syscall-template.S:84
#1  0x0000000000400d6e in std::this_thread::sleep_for<long, std::ratio<1l, 1l> > (
    __rtime=...) at /usr/include/c++/5/thread:292
#2  0x0000000000400943 in sleep () at sleep.cpp:7
#3  0x000000000040098a in main () at sleep.cpp:17

While gdb is attached, the process is paused; it resumes once the debugging session ends.

Detecting C++ Memory Leaks with Valgrind

Posted on 2017-12-31 | Category: Performance Profiling

Introduction to Valgrind

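Valgrind detects invalid memory usage in a program, such as reading uninitialized memory, out-of-bounds array accesses, and forgetting to free dynamically allocated memory. On Linux, it can be built from source and installed with the following commands: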

$ wget ftp://sourceware.org/pub/valgrind/valgrind-3.13.0.tar.bz2
$ bzip2 -d valgrind-3.13.0.tar.bz2
$ tar -xf valgrind-3.13.0.tar
$ cd valgrind-3.13.0
$ ./configure && make
$ sudo make install

Detecting Memory Leaks

Valgrind can pinpoint where a program leaks memory. Take the following program:

#include <stdlib.h>

int main()
{
    int *array = malloc(sizeof(int));
    return 0;
}

Compile the program with the -g option:

$ gcc -g -o main_c main.c

Check the program's memory usage with Valgrind:

$ valgrind --tool=memcheck --leak-check=full  ./main_c
==31416== Memcheck, a memory error detector
==31416== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==31416== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==31416== Command: ./main_c
==31416==
==31416==
==31416== HEAP SUMMARY:
==31416==     in use at exit: 4 bytes in 1 blocks
==31416==   total heap usage: 1 allocs, 0 frees, 4 bytes allocated
==31416==
==31416== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==31416==    at 0x4C2DBF6: malloc (vg_replace_malloc.c:299)
==31416==    by 0x400537: main (main.c:5)
==31416==
==31416== LEAK SUMMARY:
==31416==    definitely lost: 4 bytes in 1 blocks
==31416==    indirectly lost: 0 bytes in 0 blocks
==31416==      possibly lost: 0 bytes in 0 blocks
==31416==    still reachable: 0 bytes in 0 blocks
==31416==         suppressed: 0 bytes in 0 blocks
==31416==
==31416== For counts of detected and suppressed errors, rerun with: -v
==31416== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

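Start with the HEAP SUMMARY in the output, which describes the program's heap allocations: 1 allocs means the program allocated memory once, 0 frees means it never freed any memory, and 4 bytes allocated means 4 bytes were allocated in total.
Valgrind also reports where the leak occurred. The lines below show that the program leaked memory once, at line 5 of main.c: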

==31416== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==31416==    at 0x4C2DBF6: malloc (vg_replace_malloc.c:299)
==31416==    by 0x400537: main (main.c:5)

Valgrind can also check C++ programs for memory leaks. The following C++ program is correct and does not leak:

#include <string>
int main()
{
    auto ptr = new std::string("Hello, World!");
    delete ptr;
    return 0;
}

Analyze this program with Valgrind:

$ valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all ./main_cpp
==31438== Memcheck, a memory error detector
==31438== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==31438== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==31438== Command: ./main_cpp
==31438==
==31438==
==31438== HEAP SUMMARY:
==31438==     in use at exit: 72,704 bytes in 1 blocks
==31438==   total heap usage: 2 allocs, 1 frees, 72,736 bytes allocated
==31438==
==31438== 72,704 bytes in 1 blocks are still reachable in loss record 1 of 1
==31438==    at 0x4C2DBF6: malloc (vg_replace_malloc.c:299)
==31438==    by 0x4EC3EFF: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==31438==    by 0x40104E9: call_init.part.0 (dl-init.c:72)
==31438==    by 0x40105FA: call_init (dl-init.c:30)
==31438==    by 0x40105FA: _dl_init (dl-init.c:120)
==31438==    by 0x4000CF9: ??? (in /lib/x86_64-linux-gnu/ld-2.23.so)
==31438==
==31438== LEAK SUMMARY:
==31438==    definitely lost: 0 bytes in 0 blocks
==31438==    indirectly lost: 0 bytes in 0 blocks
==31438==      possibly lost: 0 bytes in 0 blocks
==31438==    still reachable: 72,704 bytes in 1 blocks
==31438==         suppressed: 0 bytes in 0 blocks
==31438==
==31438== For counts of detected and suppressed errors, rerun with: -v
==31438== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

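There are a few things to watch for when analyzing C++ programs with Valgrind. This program does not leak, yet HEAP SUMMARY shows 2 allocations and only 1 free. Why?
The extra allocation comes from the C++ runtime rather than from our code: the stack trace shows it being made inside libstdc++ during library initialization, where the runtime sets up a memory pool of its own for efficiency. That pool is only reclaimed by the operating system when the program exits, so Valgrind reports it as still reachable. Still reachable memory is not a leak: in the output above, 72,704 bytes are still reachable, but no leak is reported.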
