Musings of a Software Inventor, or, THE LACK THEREOF

CS512 - Survey Paper - Draft


\begin{abstract}
Software repositories keep the complete history of every file in a project, including who modified what, when, and the delta of each modification. This history can be mined for important information: how the project evolved, how the developers collaborated and what they contributed, and which milestones marked the project's development. It can also be used to predict changes in a software system, give insight into the design of the software, warn about change couplings that need to be propagated to other entities, aid program comprehension in general, and reveal dependencies between parts of a system. These applications of data mining to software repositories motivate us to summarize current approaches and recent developments.
\end{abstract}

\section*{Introduction}
Software repositories keep the complete history of every file in a project, including who modified what, when, and the delta of each modification. This history can be mined for important information: how the project evolved, how the developers collaborated and what they contributed, and which milestones marked the project's development. It can also be used to predict changes in a software system, give insight into the design of the software, warn about change couplings that need to be propagated to other entities, aid program comprehension in general, and reveal dependencies between parts of a system. These applications of data mining to software repositories motivate us to summarize current approaches and recent developments.
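
As a small illustration of the kind of information described above, the following Python sketch counts how often pairs of files change in the same commit (a simple proxy for change coupling) and how many commits each developer made. It is a minimal sketch under stated assumptions: the commit records in \texttt{COMMITS} and the helper names \texttt{co\_change\_counts} and \texttt{commits\_per\_author} are hypothetical, and a real study would parse such records from a version-control log rather than hard-code them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit records: (author, ISO date, files touched).
# Assumption: in practice these would be parsed from a repository log.
COMMITS = [
    ("alice", "2010-01-04", ["parser.c", "parser.h"]),
    ("bob", "2010-01-05", ["parser.c", "parser.h", "main.c"]),
    ("alice", "2010-01-09", ["main.c"]),
    ("carol", "2010-01-11", ["parser.c", "parser.h"]),
]


def co_change_counts(commits):
    """Count how often each unordered pair of files changes in one commit."""
    pairs = Counter()
    for _author, _date, files in commits:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


def commits_per_author(commits):
    """Count commits per developer as a crude contribution measure."""
    return Counter(author for author, _date, _files in commits)


if __name__ == "__main__":
    for pair, count in co_change_counts(COMMITS).most_common(3):
        print(pair, count)
    print(commits_per_author(COMMITS))
```

Pairs with a high count are candidates for coupled entities: a change to one file of the pair suggests that the other may need to change as well.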

We will build on a previous survey [1], and for each of the major topics it discusses we will include some recent developments. We will also consider whether the list of topics needs to be revised, with groups added, removed, or merged. Topics, along with some of the papers we will read, include:

\begin{itemize}
\item Metadata analysis
\item Static source code analysis
\item Source code differencing and analysis
\item Software metrics
\item Visualization
\item Clone detection
\item Frequent-pattern mining
\item Information-retrieval methods
\item Classification with supervised learning
\item Social network analysis
\end{itemize}

\section*{Metadata analysis}

\section*{Static source code analysis}

\section*{Source code differencing and analysis}

\section*{Software metrics}

\section*{Visualization}

\section*{Clone detection}

In clone detection, the goal is to analyze source code to find fragments that are duplicated. These may be caused by copy-paste...
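
The search for duplicated fragments described above can be illustrated with a deliberately naive Python sketch. It slides a fixed-size window over the non-blank lines of a file, normalizes whitespace, and reports windows whose text occurs more than once. The names \texttt{normalize}, \texttt{find\_clones}, and \texttt{EXAMPLE}, as well as the window size of three lines, are assumptions made for illustration. This is a purely textual baseline and does not represent any particular approach from the literature.

```python
from collections import defaultdict


def normalize(line):
    """Collapse whitespace so formatting differences do not hide a clone."""
    return " ".join(line.split())


def find_clones(source, window=3):
    """Return groups of 1-based start lines whose next `window` non-blank,
    whitespace-normalized lines are textually identical."""
    numbered = [(i + 1, normalize(text))
                for i, text in enumerate(source.splitlines())]
    numbered = [(n, text) for n, text in numbered if text]  # drop blank lines
    seen = defaultdict(list)
    for k in range(len(numbered) - window + 1):
        chunk = tuple(text for _n, text in numbered[k:k + window])
        seen[chunk].append(numbered[k][0])
    return [starts for starts in seen.values() if len(starts) > 1]


# Hypothetical input: the first three lines reappear at line 6 with
# different indentation and spacing.
EXAMPLE = """\
total = 0
for x in xs:
    total += x
print(total)

total = 0
for x in xs:
        total +=   x
print(count)
"""

if __name__ == "__main__":
    print(find_clones(EXAMPLE))
```

Because the comparison is textual, this sketch misses clones in which identifiers or literals were renamed after copying; catching those requires comparing a more abstract representation of the code, such as a token stream or syntax tree.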

The most recent approaches to this are...

\section*{Frequent-pattern mining}

\section*{Information-retrieval methods}

\section*{Classification with supervised learning}

\section*{Social network analysis}

\section*{Conclusion}

Here we sum everything up.