11 *Definition of birthmark
\r
13 Let <p, q> be programs and <f(p)> be a set of characteristics
\r
14 extracted from <p> by a certain method <f>. Then <f(p)> is called a
\r
15 <<birthmark>> of <p> iff both of the following conditions are
\r
18 [[1]] <f(p)> is obtained from <p> itself without any extra information.
\r
20 [[2]] If <q> is a copy of <p>, then <f(p) = f(q)>
\r
24 Condition 1 means that the birthmark is not extra information and is
\r
25 required for <p> to run. Hence, extracting a birthmark does not
\r
26 require extra code as watermarking does. Condition 2 states that the
\r
27 same birthmark has to be obtained from copied programs. By
\r
28 contraposition, if birthmarks <f(p)> and <f(q)> are different, then we
\r
29 can guarantee that <q> is not a copy of <p>.
\r
31 Hopefully, a birthmark will satisfy the following properties.
\r
33 [Property 1 (Preservation)] For <p'> obtained from $p$ by any program
\r
34 transformation, <f(p) = f(p')> holds.
\r
36 [Property 2 (Distinction)] For <p> and <q> such that both programs have
\r
37 same specification, if <p> and <q> are written
\r
38 independently, then <f(p) != f(q)>.
\r
42 These properties strengthen Condition 2 of birthmark definition.
\r
43 Property 1 specifies the <<preservation property>> of the birthmark
\r
44 against program transformation. We believe that clever crackers may
\r
45 try to modify birthmarks by transforming the original program into an
\r
46 equivalent one to hide the fact of theft. There are several automated
\r
47 tools used to perform the transformation, involving program
\r
48 <obfuscators> and <optimizers>. These tools can be used as a means of
\r
49 attack against the birthmarks. Property 1 specifies that the same
\r
50 birthmark must be obtained from <p> and converted to <p'>. However,
\r
51 there exist many ways to transform a program into an equivalent
\r
52 one. Hence, in reality, it is difficult to extract strong enough
\r
53 birthmarks to perfectly satisfy Property 1.
\r
55 Property 2 specifies the <<distinction property>> of the birthmark,
\r
56 stating that: even though the specification of <p> and <q> is the
\r
57 same, if implemented separately, different birthmarks should be
\r
58 extracted. In general, the detail of two independent programs is
\r
59 almost never completely the same. However, in the case that <p> and
\r
60 <q> are both <<tiny>> programs, extracted birthmarks could become the
\r
61 same, even if <p> and <q> are written independently. Those properties
\r
62 should be tuned within an allowable range at the user's discretion.
\r
64 Implemented Birthmarks
\r
68 [Proposers] H. Tamada, et al.
\r
70 [Paper] Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
71 Matsumoto, "Java Birthmarks --Detecting the Software Theft--," IEICE
\r
72 Transactions on Information and Systems, Vol. E88-D, No. 9, September
\r
77 [Proposers] H. Tamada, et al.
\r
79 [Paper] Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
80 Matsumoto, "Java Birthmarks --Detecting the Software Theft--," IEICE
\r
81 Transactions on Information and Systems, Vol. E88-D, No. 9, September
\r
86 [Proposers] H. Tamada, et al.
\r
88 [Paper] Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
89 Matsumoto, "Java Birthmarks --Detecting the Software Theft--," IEICE
\r
90 Transactions on Information and Systems, Vol. E88-D, No. 9, September
\r
95 [Proposers] H. Tamada, et al.
\r
97 [Paper] Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
98 Matsumoto, "Java Birthmarks --Detecting the Software Theft--," IEICE
\r
99 Transactions on Information and Systems, Vol. E88-D, No. 9, September
\r
102 *k-gram based birthmark
\r
104 [Proposers] G. Myles and C. Collberg
\r
106 [Paper] Ginger Myles, Christian Collberg, ``K-gram based software
\r
107 birthmarks,'' In Proc. of the 2005 ACM symposium on Applied
\r
111 Related Publications
\r
113 * Haruaki Tamada, Yuichiro Kanzaki, Masahide Nakamura, Akito Monden,
\r
114 Ken-ichi Matsumoto, "A method for extracting program fingerprints
\r
115 from Java class files," The Institute of Electronics, Information
\r
116 and Communication Engineers Technical Report, Vol. ISEC2003-29,
\r
117 pp.127-133, July 2003. (in Japanese)
\r
119 * Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
120 Matsumoto, "Detecting the theft of programs using birthmarks,"
\r
121 Information Science Technical Report, NAIST-IS-TR2003014, ISSN
\r
122 0919-9527, Graduate School of Information Science, Nara Institute
\r
123 of Science and Technology, November 2003.
\r
125 * Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
126 Matsumoto, "Design and evaluation of birthmarks for detecting theft
\r
127 of Java programs," Proc. IASTED International Conference on
\r
128 Software Engineering
\r
129 ({{{http://www.iasted.org/conferences/2004/Innsbruck/se.htm}IASTED
\r
130 SE 2004}}), pp.569-575, Innsbruck,
\r
131 Austria, 17-19 February 2004.
\r
133 * Masateru Tsunoda, Takeshi Kakimoto, Naoki Ohsugi, Akito Monden, and
\r
134 Ken-ichi Matsumoto, "Javawock: A Java Class Recommender System
\r
135 Based on Collaborative Filtering," In Proc. of 17th International
\r
136 Conference on Software Engineering and Knowledge Engineering
\r
137 ({{{http://www.ksi.edu/seke/seke05.html}SEKE2005}}), pp.491-497,
\r
138 July 2005. (Taipei, Taiwan)
\r
140 * Haruaki Tamada, Masahide Nakamura, Akito Monden, and Ken-ichi
\r
141 Matsumoto, "Java Birthmarks --Detecting the Software Theft--,"
\r
142 IEICE Transactions on Information and Systems, Vol. E88-D, No. 9,
\r
145 * Takesi Kakimoto, Akito Monden, Yasutaka Kamei, Haruaki Tamada,
\r
146 Masateru Tsunoda, and Ken-ichi Matsumoto, "Using Software
\r
147 Birthmarks to Identify Similar Classes and Major Functionalities,"
\r
148 In Proc. the 3rd International Workshop on Mining Software
\r
149 Repositories ({{{http://msr.uwaterloo.ca/msr2006/}MSR Mining
\r
150 Challenge 2006}}), pp.171--172, Shanghai, China, May
\r