About

About me

Hello, it's me

Hi, I am Xurui Yan. My Chinese name is 晏旭瑞. I received a Bachelor's degree in Computer Science and Technology from Beihang University. After that, I worked in Beijing as a developer focused on high-performance web servers. Then I got my MSc in Informatics from the University of Edinburgh. Now I am a software engineer at Microsoft.

About this website

This is my personal website and blog, where I write down thoughts, technical experiences, solutions to problems, and learning notes that I would like to share. All articles are original unless stated otherwise, and every article takes me more than 10 hours.

Technical details behind this website

This is a static site generated by Pelican, hosted on a BandwagonHost VPS, and served by NGINX.

Pelican

Pelican is a static site generator powered by Python. It is a command-line tool that roughly does the following two things:

  1. parse markdown or reStructuredText files that you write
  2. generate the HTML output from a specific theme using Jinja2 templates
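The two steps above can be illustrated with a toy sketch. This is not how Pelican is implemented: a naive paragraph splitter stands in for a real Markdown parser, and the standard library's string.Template stands in for Jinja2. The names parse_article, render, and PAGE are hypothetical.

```python
import html
from string import Template

# Stand-in for a Jinja2 theme template (assumption: a real theme is far richer).
PAGE = Template("<html><head><title>$title</title></head><body>$content</body></html>")


def parse_article(source):
    """Step 1: split 'Key: value' metadata from the body and convert the body to HTML."""
    header, _, body = source.partition("\n\n")
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip().lower()] = value.strip()
    # Naive conversion: every blank-line-separated chunk becomes a paragraph.
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    content = "".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return meta, content


def render(meta, content):
    """Step 2: fill the theme template with the parsed article."""
    return PAGE.substitute(title=html.escape(meta.get("title", "")), content=content)


meta, content = parse_article("Title: Hello\nDate: 2022-12-03\n\nFirst paragraph.\n\nSecond one.")
page = render(meta, content)
```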

There are many similar static site generators in other languages: StaticGen. I strongly recommend choosing one according to your preferred language and technical stack.

Deploy

Rsync speeds up deployment by transferring only the differences between two sets of files. There are two ways to run rsync: over ssh directly or through an rsync daemon. Transferring over ssh is more convenient because there is no rsync server to configure.

rsync -rvc -e 'ssh -p <port>' --delete output/ <user>@<ip>:/www/blog
Let me explain all the options:

  • -r, --recursive: copy directories recursively.
  • -v, --verbose: increase verbosity.
  • -c, --checksum: decide whether a file has changed by comparing checksums instead of the time of last modification. This is necessary because all files are regenerated locally on each build, so their modification times always change.
  • -e, --rsh: specify the remote shell program and its options.
  • --delete: delete files on the server that do not exist locally. Without this option, files deleted locally would remain on the server.
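If the deployment is driven from a Python script (for example a build helper), the same rsync invocation could be assembled as an argument list. This is only a sketch: build_rsync_command is a hypothetical helper, and the port, user, and address below are placeholder values.

```python
import shlex


def build_rsync_command(port, user, host, source="output/", destination="/www/blog"):
    """Return the rsync argument list described above."""
    return [
        "rsync",
        "-rvc",                   # recursive, verbose, compare by checksum
        "-e", f"ssh -p {port}",   # run over ssh on a custom port
        "--delete",               # remove remote files that no longer exist locally
        source,
        f"{user}@{host}:{destination}",
    ]


# Placeholder values; 203.0.113.10 is a documentation-only address.
command = build_rsync_command(22, "deploy", "203.0.113.10")
print(shlex.join(command))
# To actually deploy, pass the list to subprocess.run(command, check=True).
```

Passing a list rather than a single string avoids shell quoting problems with the -e argument, which contains spaces.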

HTTPS

HTTPS is enabled with a free certificate provided by Let's Encrypt.

In order to get a certificate from Let’s Encrypt, you need to use an ACME client. acme.sh is widely used nowadays.

  1. install

    curl https://get.acme.sh | sh -s email=yanxurui1993@qq.com
    
    This will also create a cron job to renew certs automatically.

  2. issue wildcard certs

    export Ali_Key="xxxxx" # My domain is bought from Ali cloud
    export Ali_Secret="xxxxx"
    acme.sh --issue -d "*.yanxurui.cc" -d "yanxurui.cc" --dns dns_ali --renew-hook /var/www/blog/extra/scripts/renew_successfully_renewed_certificate.sh
    

Basically, this step just verifies that you own the domain.

...
[Sat Dec 3 21:23:53 CST 2022] Your cert is in: /root/.acme.sh/*.yanxurui.cc/*.yanxurui.cc.cer
[Sat Dec 3 21:23:53 CST 2022] Your cert key is in: /root/.acme.sh/*.yanxurui.cc/*.yanxurui.cc.key
[Sat Dec 3 21:23:53 CST 2022] The intermediate CA cert is in: /root/.acme.sh/*.yanxurui.cc/ca.cer
[Sat Dec 3 21:23:53 CST 2022] And the full chain certs is there: /root/.acme.sh/*.yanxurui.cc/fullchain.cer
The renew hook, renew_successfully_renewed_certificate.sh, is where I sync the cert and key pair to other machines and reload the services that use the certs.

  3. install/deploy certificate
    DOMAIN="*.yanxurui.cc"
    CONFIG_ROOT="/etc/nginx/ssl/$DOMAIN"
    acme.sh -d "$DOMAIN" \
    --install-cert \
    --reloadcmd "systemctl reload nginx" \
    --fullchain-file "${CONFIG_ROOT}/fullchain.cer" \
    --key-file "${CONFIG_ROOT}/$DOMAIN.key" \
    --cert-file "${CONFIG_ROOT}/$DOMAIN.cer"
    

Google

One of the advanced features of Google Search is narrowing your results to a specific site or domain.

For example, to search for flv on my site, you can search flv site:https://www.yanxurui.cc in Google or open https://www.google.com/search?q=site:https://www.yanxurui.cc%20flv in a browser.

This is the magic behind the search box in the left sidebar of my site.
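As a rough sketch of what such a search box has to do, the query URL can be built by prefixing the keyword with the site: operator and URL-encoding the result. The site's search box itself presumably does this in the browser; site_search_url is a hypothetical name used only for this Python illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def site_search_url(keyword, site="https://www.yanxurui.cc"):
    """Build a Google search URL restricted to one site."""
    query = f"site:{site} {keyword}"
    # urlencode percent-encodes ':' and '/' and turns the space into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = site_search_url("flv")
print(url)
```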

Search Console

Google Search Console allows webmasters to control indexing and optimize the visibility of their websites.

By submitting your sitemap, you will get to know

  • Coverage
    • which URLs have been indexed successfully and which have been excluded
    • when a URL was last crawled
  • Performance
    • daily clicks and top queries/pages in the Google search results
    • which keywords lead to a page click (this is very important for SEO)
  • Links
    • top linking pages from external sites

A sitemap is usually an XML file listing all the URLs of your website that you want the search engine to index. Most importantly, it indicates how often each page is likely to change, so the search engine knows how frequently it needs to reindex it.
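To make the format concrete, here is a minimal sketch that writes a sitemap with the standard library. The element names (urlset, url, loc, lastmod, changefreq) come from the sitemaps.org protocol; build_sitemap, the page list, and the dates are made up for illustration, and a Pelican site would normally let a plugin generate this file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """pages: iterable of (loc, lastmod, changefreq) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod        # when the page last changed
        ET.SubElement(url, "changefreq").text = changefreq  # hint for the re-crawl frequency
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages and dates.
sitemap_xml = build_sitemap([
    ("https://www.yanxurui.cc/", "2022-12-03", "daily"),
    ("https://www.yanxurui.cc/about.html", "2022-12-03", "monthly"),
])
print(sitemap_xml)
```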

What if you don't want a page to be indexed by Google?

  1. use a 'noindex' tag: place the following meta tag in the <head> section of your page:
    <meta name="robots" content="noindex">
    
  2. use robots.txt
    User-agent: *
    Disallow: /archives.html
    Disallow: /tags.html
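
One way to sanity-check rules like these before deploying is Python's built-in robots.txt parser. The sketch below feeds it the same three lines; is_allowed is a hypothetical helper, and /about.html is just an example of a path that is not listed.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /archives.html
Disallow: /tags.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="Googlebot"):
    """Return True if the given crawler may fetch the path under these rules."""
    return parser.can_fetch(agent, "https://www.yanxurui.cc" + path)


print(is_allowed("/archives.html"), is_allowed("/about.html"))
```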
    

Analytics

Google Analytics is a very powerful tool for tracking and reporting your website traffic.

First things first. How can this free service help us? It can tell you

  • your daily or monthly active users
  • which pages are most frequently visited
  • how you acquire your users: by organic search, direct, or referral

...just to name a few

You must be eager to try it, and it is indeed very easy to use.

  1. Create a property which stands for your website under your account on the Analytics platform.
  2. Copy and paste the tracking code (gtag.js) as the first item in the <head> of every webpage you want to track.

Ownership of the site is verified in this way for both Search Console and Google Analytics.

Theme

The frontend design was initially done by Abdel Raoof Olakara for Jekyll, and the theme is freely available on GitHub: olakara/JekyllMetro. I forked it and made a lot of modifications before applying it to Pelican.

Image Library

PhotoSwipe

A popular image gallery used to zoom and swipe images on both desktop and mobile devices. It works when you click or tap an image in any page.

This is a 500px-like justified image grid layout which can present images in their original aspect ratio without cropping.
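
The idea behind a justified grid can be sketched in a few lines: keep adding images to a row until the row, scaled to fill the container width, is no taller than a target height. This is a generic illustration under simplifying assumptions (no gaps between images, the last row left unstretched), not the code of the library used on this site; justify_rows and its parameters are hypothetical.

```python
def justify_rows(aspect_ratios, container_width, target_height):
    """Group images (given as width/height ratios) into rows that fill the width.

    Returns a list of (row_height, [image_widths]) tuples.
    """
    rows, row = [], []
    for ratio in aspect_ratios:
        row.append(ratio)
        # Height at which the images in this row exactly span the container.
        height = container_width / sum(row)
        if height <= target_height:
            rows.append((height, [r * height for r in row]))
            row = []
    if row:
        # Leftover images keep the target height instead of being stretched.
        rows.append((target_height, [r * target_height for r in row]))
    return rows


rows = justify_rows([1.5, 1.0, 0.5, 2.0], container_width=600, target_height=200)
```

Because every image in a row shares the row height and its width is height times its own aspect ratio, nothing needs to be cropped.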

Another common layout is the Pinterest-like cascading/fluid grid layout, also known as masonry style. Examples can be seen in Masonry, Metro, and Justify layout modes.

Code Highlight

CodeHilite adds highlighting to code blocks using Pygments. This extension is included in the standard Python Markdown library.

Let me explain it in a detailed way.

Like any Markdown interpreter, Python Markdown can convert

    print('hello')
into
<pre><code>print('hello')</code></pre>

We need two steps to achieve highlighting:

1. set classes for different components in the code block

This is done by the CodeHilite extension.

This extension also enables us to specify the language.

The result of

    :::python
    print('hello')
is
<div class="codehilite">
    <pre>
        <span></span><span class="k">print</span><span class="p">(</span><span class="s1">&#39;hello&#39;</span><span class="p">)</span>
    </pre>
</div>
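
To show what "setting classes" means mechanically, here is a toy highlighter built on Python's own tokenizer. It is not what CodeHilite or Pygments do internally, and the class names are a simplified assumption loosely modeled on the output above (for example, print is tagged as a plain name, n, rather than k); highlight_line is a hypothetical function that handles a single line of Python only.

```python
import html
import io
import keyword
import tokenize

# Simplified token-type to CSS-class mapping (assumed names).
CSS_CLASSES = {
    tokenize.STRING: "s1",
    tokenize.NUMBER: "m",
    tokenize.OP: "p",
    tokenize.COMMENT: "c",
}
SKIPPED = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
           tokenize.INDENT, tokenize.DEDENT}


def highlight_line(source):
    """Wrap each token of one line of Python in a <span> carrying a CSS class."""
    parts = []
    column = 0
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type in SKIPPED:
            continue
        # Re-insert the whitespace the tokenizer dropped between tokens.
        parts.append(" " * (token.start[1] - column))
        if token.type == tokenize.NAME:
            css = "k" if keyword.iskeyword(token.string) else "n"
        else:
            css = CSS_CLASSES.get(token.type, "x")
        parts.append(f'<span class="{css}">{html.escape(token.string)}</span>')
        column = token.end[1]
    return f'<div class="codehilite"><pre>{"".join(parts)}</pre></div>'


print(highlight_line("print('hello')"))
```

The output carries no colors at all, only class names; the stylesheet from step 2 decides how each class is displayed.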

I prefer fenced code blocks, which are supported on GitHub, so we still need markdown.extensions.fenced_code to support this syntax.

import markdown
markdown.markdownFromFile(input='test.md', extensions=['markdown.extensions.fenced_code'])
The third party extensions pymdownx.highlight plus pymdownx.superfences can do the same thing.

2. import css rules to define styles for different classes

Pygments can generate a CSS stylesheet, in a choice of themes, covering all the CSS classes mentioned above.

Install Pygments library and run the following command:

pygmentize -S monokai -f html -a .highlight > styles.css

  • -S specifies the theme to be used.
  • -a sets the CSS selector prefixed to every rule. It must match the class of the wrapper <div> tag, which CodeHilite sets to codehilite by default.

Then link this CSS file in the head of the HTML file.

Code highlighting can also be done by JavaScript highlighters such as highlight.js instead of the CSS stylesheet produced by Pygments. CodeHilite generates an HTML snippet like the one below if use_pygments is turned off.

<pre><code class="python">print('hello')</code></pre>

My tweets shown on the homepage

They are pulled from the status of my private GitHub repo using JavaScript and the GitHub API. I write my thoughts there every day.

Math Equations

MathJax is a great JavaScript library for rendering LaTeX math equations on the web. Just include the following lines in the markdown file where equations appear.

<script type="text/javascript" async
  src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-MML-AM_CHTML">
</script>

The default math delimiters are $$...$$ and \[...\] for displayed mathematics, and \(...\) for in-line mathematics.

Besides, \begin{align}...\end{align} is also supported.

A few measures should be taken to render equations as expected because the markdown interpreter and the browser interpret the text before MathJax does:

  1. use a double backslash (\\) to obtain a single backslash in the HTML page for MathJax
  2. escape * and _ when needed
  3. avoid anything that looks like an HTML tag in the equation (even <y is not allowed)

Fortunately, a Python Markdown extension will handle 1 and 2 properly.
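
For readers without such an extension, measures 1 and 2 could be approximated by a small pre-processing step that runs before the Markdown interpreter. The sketch below is a hypothetical helper, not the extension mentioned above, and it only looks at $$...$$ blocks.

```python
import re

# Matches display math delimited by $$...$$ (other delimiters are ignored here).
DISPLAY_MATH = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)


def escape_math(text):
    """Escape backslashes, * and _ inside $$...$$ so Markdown leaves them alone."""
    def fix(match):
        body = match.group(1)
        # Measure 1: double every backslash so one survives Markdown processing.
        body = body.replace("\\", "\\\\")
        # Measure 2: escape the emphasis characters * and _.
        body = re.sub(r"([*_])", r"\\\1", body)
        return "$$" + body + "$$"
    return DISPLAY_MATH.sub(fix, text)


print(escape_math(r"a_b outside, $$x_i \cdot y$$ inside"))
```

The order matters: backslashes are doubled first, so the single backslashes added in front of * and _ are not doubled again.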


© 2024 - Xurui Yan. All rights reserved
Built using pelican