:sleepy: update trie

author ouuan <y___o___u@126.com>

Fri, 13 Sep 2019 17:01:08 +0000 (01:01 +0800)

committer ouuan <y___o___u@126.com>

Fri, 13 Sep 2019 17:01:08 +0000 (01:01 +0800)
diff --git a/docs/string/trie.md b/docs/string/trie.md

index 0d6e776..2256ee4 100644 (file)
--- a/docs/string/trie.md
+++ b/docs/string/trie.md
@@ -8,11 +8,13 @@
  
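  Notice that this trie uses edges to represent letters, and the path from the root to any node in the tree represents a string. For example, $1\to4\to 8\to 12$ represents the string `caa`.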
  
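-The structure of a Trie is easy to understand: we use a two-dimensional array $tr[i,j]$ to denote the node that character j of node i points to, or in other words the node for the string formed by appending character j to the string that node i represents. (The range of j depends on the size of the character set; it is not necessarily $0\sim 26$.)
+The structure of a Trie is easy to understand: we use $\delta(u,c)$ to denote the node that character $c$ of node $u$ points to, or in other words the node for the string formed by appending character $c$ to the string that node $u$ represents. (The range of $c$ depends on the size of the character set; it is not necessarily $0\sim 25$.)
+
+Sometimes we need to record which strings were inserted into the Trie; whenever an insertion finishes, simply set a mark on the node that represents the inserted string.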
  
  ## Implementation
  
-Here is a template wrapped in a struct; it is very easy to understand
+Here is a template wrapped in a struct:
  
  ```cpp
  struct trie {
@@ -40,14 +42,161 @@ struct trie {
  };
  ```
  
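-## KMP on a Trie
+## Applications
+
+### String lookup
+
+The most basic application of a trie: checking whether a string appears in the "dictionary".
+
+#### [Luogu P2580 于是他错误的点名开始了](https://www.luogu.org/problemnew/show/P2580)
+
+Build a Trie from all the names, then look up each called name in the Trie; the first time a name is called, mark it as already called.
+
+??? note "Reference code"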
+    ```cpp
+    #include <cstdio>
+    
+    const int N = 500010;
+    
+    char s[60];
+    int n, m, ch[N][26], tag[N], tot = 1;
+    
+    int main()
+    {
+      scanf("%d", &n);
+      
+      for (int i = 1; i <= n; ++i)
+      {
+        scanf("%s", s + 1);
+        int u = 1;
+        for (int j = 1; s[j]; ++j)
+        {
+          int c = s[j] - 'a';
+          if (!ch[u][c]) ch[u][c] = ++tot;
+          u = ch[u][c];
+        }
+        tag[u] = 1;
+      }
+      
+      scanf("%d", &m);
+      
+      while (m--)
+      {
+        scanf("%s", s + 1);
+        int u = 1;
+        for (int j = 1; s[j]; ++j)
+        {
+          int c = s[j] - 'a';
+          u = ch[u][c];
+          if (!u) break; // no outgoing edge for this character, so the name does not exist
+        }
+        if (tag[u] == 1)
+        {
+          tag[u] = 2;
+          puts("OK");
+        }
+        else if (tag[u] == 2) puts("REPEAT");
+        else puts("WRONG");
+      }
+      
+      return 0;
+    }
+    ```
+
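+### AC automaton
+
+A Trie is one component of the [AC automaton](./ac-automaton.md).
  
-What we actually need to do is compute the $next$ value of every node in the Trie.
+### XOR-related problems
  
-Of course, $next$ here is no longer a number but effectively a pointer: it may point to a node on another branch.
+Treating the binary representation of a number as a string, we can build a Trie over the character set $\{0,1\}$.
  
-Here $next$ is defined as: the length of the longest path starting from the root that equals the suffix of the same length.
+#### [BZOJ1954 最长异或路径](https://www.luogu.org/problem/P4551)
+Pick an arbitrary root $root$ and let $T(u, v)$ denote the XOR of the edge weights on the path between $u$ and $v$. Then $T(u,v)=T(root, u)\oplus T(root,v)$, because the part above the [LCA](../graph/lca.md) is XORed twice and cancels out.
+
+So if we insert every $T(root, u)$ into a Trie, then for each $T(root, u)$ we can quickly find the $T(root, v)$ that maximizes the XOR with it:
+
+Starting from the root of the Trie, move into the subtree whose bit differs from the current bit of $T(root, u)$ whenever that subtree exists; otherwise there is no choice.
+
+Correctness of the greedy: taking this branch makes the current bit of the result $1$, while not taking it makes it $0$, and higher bits take priority when maximizing.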
+
+??? note "Reference code"
+    ```cpp
+    #include <cstdio>
+    #include <algorithm>
+    
+    const int N = 100010;
+    
+    int head[N], nxt[N << 1], to[N << 1], weight[N << 1], cnt;
+    int n, dis[N], ch[N << 5][2], tot = 1, ans;
+    
+    void insert(int x)
+    {
+      for (int i = 30, u = 1; i >= 0; --i)
+      {
+        int c = ((x >> i) & 1);
+        if (!ch[u][c]) ch[u][c] = ++tot;
+        u = ch[u][c];
+      }
+    }
+    
+    void get(int x)
+    {
+      int res = 0;
+      for (int i = 30, u = 1; i >= 0; --i)
+      {
+        int c = ((x >> i) & 1);
+        if (ch[u][c ^ 1])
+        {
+          u = ch[u][c ^ 1];
+          res |= (1 << i);
+        }
+        else u = ch[u][c];
+      }
+      ans = std::max(ans, res);
+    }
+    
+    void add(int u, int v, int w)
+    {
+      nxt[++cnt] = head[u];
+      head[u] = cnt;
+      to[cnt] = v;
+      weight[cnt] = w;
+    }
+    
+    void dfs(int u, int fa)
+    {
+      insert(dis[u]);
+      get(dis[u]);
+      for (int i = head[u]; i; i = nxt[i])
+      {
+        int v = to[i];
+        if (v == fa) continue;
+        dis[v] = dis[u] ^ weight[i];
+        dfs(v, u);
+      }
+    }
+    
+    int main()
+    {
+      scanf("%d", &n);
+      
+      for (int i = 1; i < n; ++i)
+      {
+        int u, v, w;
+        scanf("%d%d%d", &u, &v, &w);
+        add(u, v, w);
+        add(v, u, w);
+      }
+      
+      dfs(1, 0);
+      
+      printf("%d", ans);
+      
+      return 0;
+    }
+    ```
  
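-It is computed the same way as in [KMP](/string/kmp/#knuth-morris-pratt), except that it is done with a [BFS](/search/bfs) over the Trie.
+### Persistent trie
  
-Complexity: the usual amortized analysis no longer holds; it can only be applied along each chain, so the total complexity is the sum of the pattern string lengths.
+See [Persistent trie](../ds/persistent-trie.md).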
+\ No newline at end of file
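
Note on the patch above: the greedy query it describes for the XOR section can be tried in isolation, without the tree traversal. The following is a minimal sketch, not part of the commit. It assumes non-negative values below $2^{31}$ (matching the `i = 30` loops in the patch), and the names `BinaryTrie`, `maxXorWith` and `maxXorPair` are hypothetical, introduced only for illustration. It uses a growable node pool instead of the fixed-size `ch` array.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical helper, not taken from the patch: a Trie over the character
// set {0, 1}, keyed by the 31 low bits of a non-negative int.
struct BinaryTrie {
  // ch[u][b] is the child of node u along bit b, or 0 if it does not exist.
  // Index 0 is unused so that 0 can mean "no child"; node 1 is the root.
  std::vector<std::array<int, 2>> ch;

  BinaryTrie() : ch(2) {}  // value-initialized: both nodes start with no children

  void insert(int x) {
    int u = 1;
    for (int i = 30; i >= 0; --i) {
      int c = (x >> i) & 1;
      if (!ch[u][c]) {
        ch.push_back({0, 0});
        ch[u][c] = static_cast<int>(ch.size()) - 1;
      }
      u = ch[u][c];
    }
  }

  // Largest x XOR y over all inserted y. Assumes at least one insertion.
  int maxXorWith(int x) const {
    int u = 1, res = 0;
    for (int i = 30; i >= 0; --i) {
      int c = (x >> i) & 1;
      if (ch[u][c ^ 1]) {
        // A subtree with the opposite bit exists: this bit of the result is 1.
        u = ch[u][c ^ 1];
        res |= 1 << i;
      } else {
        // No choice: follow the same bit, this bit of the result stays 0.
        u = ch[u][c];
      }
    }
    return res;
  }
};

// Maximum XOR of any two elements (an element may be paired with itself,
// which contributes 0, as in the patch where insert precedes get).
int maxXorPair(const std::vector<int>& a) {
  BinaryTrie trie;
  int best = 0;
  for (int x : a) {
    trie.insert(x);
    best = std::max(best, trie.maxXorWith(x));
  }
  return best;
}
```

For example, `maxXorPair({3, 10, 5, 25, 2, 8})` returns `28`, obtained from `5 ^ 25`. In the tree problem, the same query is applied to the root-to-node XOR values `dis[u]`.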