Do1e

Using a Reverse Proxy to Keep Microsoft Clarity from Being Blocked by Ad-Blocking Rules

This post is synced to xLog by Mix Space.
For the best reading experience, please visit the original link:
https://www.do1e.cn/posts/code/avoid-clarity-blocked


Introduction#

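Microsoft Clarity analyzes the behavior of users visiting a website, and this site uses it too.
I find it quite handy: whenever I have a spare moment, I can check how the site is being visited.
Setup is also very simple: just add the one snippet provided by the Clarity dashboard inside the <head> tag.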

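However, many ad-blocking rules block Clarity's domains, so data for quite a few visitors goes missing. This can be avoided by reverse-proxying Clarity through your own domain with nginx; the concrete steps are given here for reference.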

Analyzing the Requests#

Below is an example of the code added inside the <head> tag.

<script type="text/javascript">
    (function(c,l,a,r,i,t,y){
        c[a]=c[a]||function(){(c[a].q=c[a].q||[]).push(arguments)};
        t=l.createElement(r);t.async=1;t.src="https://www.clarity.ms/tag/"+i;
        y=l.getElementsByTagName(r)[0];y.parentNode.insertBefore(t,y);
    })(window, document, "clarity", "script", "abcdefg");
</script>

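Reading this snippet, or capturing the traffic with the browser's developer tools, shows that the code first requests https://www.clarity.ms/tag/abcdefg , which returns a JavaScript file: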

(function (c, l, a, r, i, t, y) {
    function sync() {
        (new Image()).src = "https://c.clarity.ms/c.gif";
    }

    if ("complete" == document.readyState) {
        sync();
    } else {
        window.addEventListener("load", sync);
    }

    if (a[c].v || a[c].t) {
        return a[c]("event", c, "dup." + i.projectId);
    }

    a[c].t = true;
    t = l.createElement(r);
    t.async = true;
    t.src = "https://www.clarity.ms/s/0.8.13-beta/clarity.js";

    y = l.getElementsByTagName(r)[0];
    y.parentNode.insertBefore(t, y);

    a[c]("start", i);
    a[c].q.unshift(a[c].q.pop());
    a[c]("set", "C_IS", "0");
})(
    "clarity",
    document,
    window,
    "script",
    {
        projectId: "abcdefg",
        upload: "https://z.clarity.ms/collect",
        expire: 365,
        cookies: ["_uetmsclkid", "_uetvid"],
        track: true,
        content: true,
        unmask: ["body"],
        dob: 2002
    }
);

It contains 3 links:

  1. https://c.clarity.ms/c.gif
  2. https://www.clarity.ms/s/0.8.13-beta/clarity.js
  3. https://z.clarity.ms/collect

Each of these links needs to be replaced.
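
As a rough illustration of how these links can be located programmatically, here is a minimal Python sketch. The `SAMPLE_TAG_SCRIPT` text and the `find_clarity_urls` helper are hypothetical names introduced for this example; the sample is a shortened stand-in for the tag script shown above, not the real response.

```python
import re

# Hypothetical, shortened stand-in for the tag script shown above.
SAMPLE_TAG_SCRIPT = '''
(new Image()).src = "https://c.clarity.ms/c.gif";
t.src = "https://www.clarity.ms/s/0.8.13-beta/clarity.js";
upload: "https://z.clarity.ms/collect",
'''


def find_clarity_urls(script_text: str) -> list[str]:
    """Return each distinct https URL on a clarity.ms host, in order of appearance."""
    # Match any subdomain of clarity.ms, then everything up to a quote or whitespace.
    pattern = r'https://[a-z0-9.-]*clarity\.ms/[^"\s]*'
    seen = []
    for url in re.findall(pattern, script_text):
        if url not in seen:
            seen.append(url)
    return seen


if __name__ == "__main__":
    for url in find_clarity_urls(SAMPLE_TAG_SCRIPT):
        print(url)
```

Running the same helper on a freshly downloaded tag script is a quick way to notice if the set of hosts ever changes.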

Replacing the Links and Setting Up the Reverse Proxy#

Assuming your domain is example.com , create a subdomain clarity.example.com and point it at an nginx server of yours that is reachable from the public internet.

Then save the script returned by https://www.clarity.ms/tag/abcdefg to /var/www/html/tag/abcdefg :

mkdir -p /var/www/html/tag
wget https://www.clarity.ms/tag/abcdefg -O /var/www/html/tag/abcdefg

Then, in the saved file, replace the 3 links above, in order, with:

  1. https://clarity.example.com/c.gif
  2. https://clarity.example.com/s/0.8.13-beta/clarity.js
  3. https://clarity.example.com/collect
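
The replacement can also be done in code rather than by hand. The following is a minimal sketch, assuming the only hosts involved are the three seen above (`www`, `c`, and `z` under clarity.ms); the function name `rewrite_hosts` and the sample string are made up for illustration.

```python
import re

# Assumption: only these three Clarity subdomains appear in the tag script.
CLARITY_HOST_PATTERN = re.compile(r"https://(?:www|c|z)\.clarity\.ms")


def rewrite_hosts(script_text: str, custom_domain: str) -> str:
    """Point every Clarity URL in the script at the custom domain, keeping the paths."""
    return CLARITY_HOST_PATTERN.sub(f"https://{custom_domain}", script_text)


# Hypothetical, shortened sample of the saved tag script.
sample = (
    '(new Image()).src = "https://c.clarity.ms/c.gif";\n'
    't.src = "https://www.clarity.ms/s/0.8.13-beta/clarity.js";\n'
    'upload: "https://z.clarity.ms/collect",\n'
)
rewritten = rewrite_hosts(sample, "clarity.example.com")
print(rewritten)
```

Matching the full `https://host` prefix, rather than the bare host name, keeps the substitution from touching unrelated text that happens to contain a similar string.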

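Next, download https://www.clarity.ms/s/0.8.13-beta/clarity.js to /var/www/html/s/0.8.13-beta/clarity.js . The version may be different by the time you read this; adjust the paths accordingly. You can also use a directory other than /var/www/html , but make sure its permissions allow the nginx process to read it. The directory structure now looks like this: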

> tree /var/www/html
/var/www/html
├── clarity.js
├── s
│   ├── 0.8.13-beta
│   │   └── clarity.js
│   └── 0.8.9
│       └── clarity.js
└── tag
    ├── abcdefg
    └── hijklmn

5 directories, 5 files
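
The layout simply mirrors the URL paths under the web root. A small sketch of that mapping, with `local_path_for` as a hypothetical helper name and /var/www/html as the assumed web root:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_path_for(url: str, web_root: str = "/var/www/html") -> str:
    """Map a Clarity asset URL to the file nginx would serve for the same path."""
    # Drop the leading slash so the URL path is joined under web_root
    # instead of replacing it.
    url_path = urlparse(url).path.lstrip("/")
    return str(PurePosixPath(web_root) / url_path)


print(local_path_for("https://www.clarity.ms/s/0.8.13-beta/clarity.js"))
print(local_path_for("https://www.clarity.ms/tag/abcdefg"))
```

Because the path is derived from the URL, a new clarity.js version lands in its own directory without any change to the helper.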

nginx reverse proxy configuration for reference:

server {
    listen 80;
    listen [::]:80;
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name clarity.example.com;

    if ($scheme = http) {
        return 301 https://$host$request_uri;
    }
    location /c.gif {
        proxy_pass https://c.clarity.ms$request_uri;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
    location /collect {
        proxy_pass https://z.clarity.ms$request_uri;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
    root /var/www/html;
}
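
Once nginx has been reloaded, it can be worth confirming that the two static files are actually being served. The sketch below is one possible smoke check using only the standard library; `static_urls` and `status_of` are hypothetical helpers, and the proxied /c.gif and /collect endpoints are deliberately left out because the post does not describe their expected responses.

```python
import urllib.request


def static_urls(domain: str, project_id: str, version: str) -> list[str]:
    """Build the URLs of the two files served from the nginx root."""
    base = f"https://{domain}"
    return [f"{base}/tag/{project_id}", f"{base}/s/{version}/clarity.js"]


def status_of(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET request.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    so a returned value means the request succeeded.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


# Example usage (performs real network requests, so it is left commented out):
# for url in static_urls("clarity.example.com", "abcdefg", "0.8.13-beta"):
#     print(url, status_of(url))
```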

Reconfiguring the Website#

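In the script originally added to your site's <head>, replace www.clarity.ms with clarity.example.com. That's all there is to it; visit your site and check whether the dashboard shows live users!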

Drawbacks#

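  1. The dashboard now shows every visit as originating from the location of the nginx server. It would be nice if Microsoft determined the location from the request headers instead.
  2. clarity.js no longer follows Microsoft's upgrades automatically. You can, however, write a script to automate this; reference code: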
import os
import re
import requests
from dotenv import load_dotenv

load_dotenv()

PROJECT_ID = os.getenv("PROJECT_ID")
BASE_DIR = os.getenv("BASE_DIR", "/var/www/html/clarity")
CUSTOM_DOMAIN = os.getenv("CUSTOM_DOMAIN")
NGINX_CONF = os.getenv("NGINX_CONF", "/etc/nginx/conf.d/clarity.conf")
if not PROJECT_ID:
    raise ValueError("PROJECT_ID is not set in the environment variables.")
if not CUSTOM_DOMAIN:
    raise ValueError("CUSTOM_DOMAIN is not set in the environment variables.")

def get_index(try_times = 5):
    # Fetch the tag script for this project; retry up to try_times more times on a non-200 response.
    resp = requests.get(f'https://www.clarity.ms/tag/{PROJECT_ID}', timeout=10)
    if resp.status_code != 200:
        if try_times > 0:
            return get_index(try_times - 1)
        raise RuntimeError(f"Failed to fetch data from Clarity. Status code: {resp.status_code}\n{resp.text}")
    return resp

def get_script(url, try_times = 5):
    # Fetch clarity.js itself, with the same retry behaviour as get_index.
    resp = requests.get(url, timeout=10)
    if resp.status_code != 200:
        if try_times > 0:
            return get_script(url, try_times - 1)
        raise RuntimeError(f"Failed to fetch Clarity script. Status code: {resp.status_code}\n{resp.text}")
    return resp

resp = get_index()
script_content = resp.text

# Escape the dots so they match literal "." characters only.
clarity_js_url = re.search(r'https://www\.clarity\.ms/s/[^"]+\.js', script_content)
if not clarity_js_url:
    raise RuntimeError("Clarity script URL not found in the response.")
clarity_js_url = clarity_js_url.group(0)
upload_url = re.search(r'"upload":"([^"]+)"', script_content)
if not upload_url:
    raise RuntimeError("Upload URL not found in the response.")
upload_url = upload_url.group(1)

clarity_js_version = clarity_js_url.split('/')[-2]
clarity_js_path = os.path.join(BASE_DIR, 's', clarity_js_version, 'clarity.js')
if not os.path.exists(clarity_js_path):
    resp = get_script(clarity_js_url)
    os.makedirs(os.path.dirname(clarity_js_path), exist_ok=True)
    with open(clarity_js_path, 'wb') as f:
        f.write(resp.content)

with open(NGINX_CONF, 'w') as f:
    f.write("""server {
    listen 80;
    listen [::]:80;
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name %s;
    if ($scheme = http) {
        return 301 https://$host$request_uri;
    }
    location /c.gif {
        proxy_pass https://c.clarity.ms$request_uri;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
    location /collect {
        proxy_pass https://%s$request_uri;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
    root %s;
}
""" % (CUSTOM_DOMAIN, upload_url.split('/')[2], BASE_DIR))

os.system('nginx -s reload')

new_script_content = script_content.replace(
    'www.clarity.ms',
    CUSTOM_DOMAIN
).replace(
    upload_url.split('/')[2],
    CUSTOM_DOMAIN
).replace(
    'c.clarity.ms',
    CUSTOM_DOMAIN
)

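# Make sure the tag directory exists before writing the rewritten script.
tag_path = os.path.join(BASE_DIR, 'tag', PROJECT_ID)
os.makedirs(os.path.dirname(tag_path), exist_ok=True)
with open(tag_path, 'w') as f:
    f.write(new_script_content)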
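
Since the script rewrites the nginx configuration on every run, one optional safeguard is to validate the file with `nginx -t` before reloading. This is a sketch of that idea, not part of the original script; `reload_nginx` is a hypothetical helper, and the `run` parameter exists only so the function can be exercised without a real nginx installation.

```python
import subprocess


def reload_nginx(run=subprocess.run) -> bool:
    """Reload nginx only if `nginx -t` accepts the current configuration."""
    # `nginx -t` tests the configuration and exits non-zero on errors.
    check = run(["nginx", "-t"], capture_output=True, text=True)
    if check.returncode != 0:
        # nginx reports configuration test results on stderr.
        print(check.stderr)
        return False
    run(["nginx", "-s", "reload"], check=True)
    return True
```

In the script above, a call to `reload_nginx()` could stand in for `os.system('nginx -s reload')`, so that a broken generated file does not reach the running server.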