1. Introduction
In today’s internet-driven world, data grows at an unprecedented rate, and vast amounts of information are generated daily. Given the limits of human attention, sifting through this mass of data to find crucial information has become exceedingly difficult, so recommender systems play an important role in helping users identify relevant information. Major companies such as Amazon, Google, and Yahoo have adopted various recommender system methods to understand user preferences and provide personalized recommendations for information of interest. In real-world scenarios, however, the continuous influx of new users and items poses a challenge: the lack of sufficient historical user behavior data leads to the cold-start problem [1], in which personalized recommendations become difficult to generate.
The goal of recommender systems is to help users discover relevant items from a vast selection, thereby enhancing user satisfaction and engagement. Traditional recommender methods, however, primarily employ one-hot encoding [2], creating binary vectors to represent user and item identities. Although simple and practical, these methods can only predict interactions between existing users and items and fail to explicitly capture the underlying relationships between user preferences and item attributes. Recent research [3] has proposed incorporating additional information, such as user demographic attributes and item features, to reveal the hidden connections between user preferences and item characteristics. By integrating detailed user information and item features, recommender systems can better comprehend user interests and expectations, overcoming a key limitation of traditional methods.
An intuitive way to overcome the cold-start problem is to integrate data and knowledge from different domains. By exploiting the correlations and associations between domains, useful knowledge and data can be obtained from multiple sources; this approach is known as cross-domain recommendation [4], as illustrated in Fig. 1. In real-world scenarios there often exist users who overlap across multiple domains, and the interaction records of these overlapping users serve as auxiliary information for the current domain. Transfer learning [5] is effective for the cross-domain cold-start problem: by transferring user behavior data and recommendation models from one domain to another, the target domain can produce personalized recommendations using knowledge from the source domain. However, most existing efforts focus on constructing complex neural networks while overlooking the importance of model optimization in cold-start scenarios.
Fig. 1 Schematic of a cross-domain recommender system with two domains. User preferences are learned from the items users interact with in both domains.
In recent research on meta-learning-based recommender systems [6], researchers have used meta-learning techniques to tackle the cold-start challenge from an optimization standpoint. In these approaches, meta-learning methods [7] enable quick adaptation to new users’ interests by sharing knowledge across multiple users, thus providing personalized recommendations. This line of work has proven effective in improving recommender performance, especially in few-shot and cold-start situations: by learning user preferences from historical behavior data, the model requires only a small amount of interaction information to make accurate recommendations when new users arrive. However, most current meta-learning-based approaches [8] focus on building meta-learning frameworks to capture different user preferences and often use multilayer perceptrons as the base model. This can lead the model to neglect cross-domain information and may reduce its generalization across tasks.
Therefore, a model is needed that fully leverages sample data and cross-domain information while also introducing optimization strategies for the model itself. To this end, we introduce TECDR, a cross-domain recommender system that leverages a domain knowledge transferor (DKT) and a latent preference extractor (LPE) to address the cold-start challenge in cross-domain scenarios. The DKT transfers knowledge between the two domains, while the LPE extracts latent preference information from user and item features. Finally, we employ a meta-learning-based optimization method to further improve TECDR. The contributions of this paper are as follows:
- We propose TECDR, a novel cross-domain recommender system framework that leverages a domain knowledge transferor and a latent preference extractor to address the cold-start challenge. This approach better suits practical needs in terms of generalizability, interpretability, and diversity.
- We propose the LPE, which specializes in extracting users’ latent preferences and maps user and item features from various domains into a shared latent preference space, ensuring effective feature alignment.
- We propose the DKT, which is responsible for migrating knowledge and experience between two domains, thus enabling knowledge sharing.
- We conduct extensive experiments on three cross-domain scenarios to confirm the effectiveness of TECDR on the cross-domain cold-start challenge.
The remainder of this work is structured as follows. We present a short review of related work in Sect. 2. We give the details of our recommender system model in Sect. 3. Section 4 summarizes our experiments and findings. Lastly, in Sect. 5, we conclude and discuss future research directions.
2. Related Work
2.1 Cold-Start Recommender Systems
Collaborative filtering [9] has achieved significant success in recommender systems. However, the cold-start challenge frequently arises when making recommendations for new users or items because of data sparsity and insufficient samples. Traditional solutions often rely on data augmentation strategies, such as using demographic information like gender and age or adding interest tags. In recent research, Jain et al. [10] proposed enhancing the cognitive capabilities of recommender systems in cold-start scenarios by incorporating cognitive features such as publication year, type, and age. Anwar et al. [11] enriched target-domain data by using context and time data as additional information for recommended items. Unlike purely content-based recommender systems, these approaches model the link between topic information and helpful embeddings, allowing the model to learn representations from topic information with which to estimate those embeddings. They achieve this by constructing several constraint functions, including local geometric similarity and mean squared error. Local geometric similarity constraints focus the results on similar items, yielding more relevant and stronger recommendations, while mean squared error constraints help the system fit user behavior data more accurately during training. Furthermore, Barkan et al. [12] introduced the CB2CF model to bridge the gap between content-based and collaborative filtering representations. Li et al. [13] proposed a low-rank linear autoencoder that maps user behavior into user attributes; the symmetric encoder reconstructs user behavior from user attributes and uses mean squared error as the constraint function. However, the local geometric similarity constraint may lead to overfitting, causing the model to rely excessively on the features of a few similar items and neglect more diverse interests, and mean squared error constraints require abundant user behavior data, which is sparse in cold-start scenarios where user and item data are limited. In contrast, transfer learning addresses the cold-start problem by sharing knowledge and experience between different domains or tasks, enabling the transfer of data between two domains. Li et al. [14] proposed a cross-domain recommender system that iteratively transfers information across two associated domains and relies on an additive learning approach, using latent orthogonal mappings to obtain user preferences across domains. Xu et al. [15] applied transfer learning to deep Q-networks, significantly reducing the training time of deep reinforcement learning. In this context, we design a novel domain knowledge transferor to transfer knowledge and patterns. Compared with existing methods, the novelty of our approach lies in sharing user embeddings across the two domains. This allows the domains to collaboratively learn user feature representations, enhancing the model’s ability to represent and understand user features and ultimately improving generalization. At the same time, shared user embeddings map users and items from the source and target domains into a common feature space, facilitating cross-domain knowledge transfer and mitigating the adverse effects of domain differences on model performance.
2.2 Cross-Domain Recommender Systems
Cross-domain recommender systems integrate and utilize information about users and items from multiple domains to provide personalized recommendations across domains. By analyzing user behavior in different domains and identifying similarities between users and items, transferring data and knowledge from one or more source domains to the target domain can effectively handle data sparsity and cold-start challenges. The success of cross-domain recommendation lies in knowledge sharing, where the model must capture the relationships between domains. Transfer learning is an efficient way to address the cold-start challenge in cross-domain recommender systems. Based on the transfer strategy, existing cross-domain recommender systems can be categorized into two types: content-based transfer learning systems [16] and feature-based transfer learning systems [17]. In recent research, Zhu et al. [18] created user and item embeddings from ratings and multi-source information and designed an adaptive embedding-sharing technique based on multi-task learning to combine the embeddings of common users across domains. Ma et al. [19] considered both user behavior flow and knowledge flow by combining behavior transfer units and knowledge transfer units, where a mixed information flow network determines when to utilize cross-domain information. Xie et al. [20] employed a diversified preference network to capture multiple signals reflecting users’ different interests and designed domain-specific and domain-agnostic contrastive learning tasks to facilitate knowledge transfer. Di et al. [21] applied attention mechanisms to locate transferable traits that aid knowledge transfer and used a meta-recommender module to learn tailored preferences for cold-start users. Sahu et al. [22] used user profiles to transfer knowledge between domains, constructing the profiles from demographic information and users’ rating behavior, and then employed maximum a posteriori estimation to discover latent factors for users and items in both domains. Zhao et al. [23] built a cross-domain heterogeneous graph and redefined the message-passing mechanism of graph convolutional networks to identify high-order cross-domain similarities. Liang et al. [24] enriched the target domain’s user groups using source-domain data and designed a hierarchical attention network to learn and model individual and group preferences. Guan et al. [25] framed the cold-start challenge in cross-domain scenarios as a small-sample problem and addressed it with cross-domain knowledge and optimization models; this work is highly relevant to our research.
Nevertheless, the majority of contemporary transfer learning-based cross-domain recommender system methods emphasize building complicated cross-domain models to enable more effective knowledge sharing across domains, often overlooking the efficient extraction of users’ latent preferences. In this paper, we propose TECDR, which emphasizes the extraction of users’ latent preferences while leveraging transfer learning methods to address the cold-start challenge in cross-domain recommender scenarios.
3. Method
In this part, we begin by describing how TECDR addresses the cross-domain cold-start problem. Then, we present detailed descriptions of the two modules we propose, namely the Latent Preference Extractor (LPE) and the Domain Knowledge Transferor (DKT).
3.1 Problem Formulation
We define the notation for the TECDR approach in the context of two different domains: the source domain, denoted as \(S\), and the target domain, denoted as \(T\). Both domains contain user information, item information, and interaction records, but their data sparsity levels differ. The item feature set in the source domain is represented as \(F_{S}\), and in the target domain as \(F_{T}\). The set of overlapping user features between the two domains is denoted as \(D\). We represent users as \(m\) and items as \(n\). In the target domain, historical user behavior data is often sparse, leading to the cold-start challenge. TECDR aims to leverage the overlapping users and the items of both domains, along with user-item interaction data, to train a model that predicts user ratings for items. This can be represented as follows:
\[\begin{align} & \hat{y}_{S}=\sigma (m,n_{S};\varphi) \tag{1} \\ & \hat{y}_{T}=\sigma (m,n_{T};\varphi) \tag{2} \end{align}\]
In the provided notation, \(\sigma()\) represents the rating prediction model, and \(\varphi\) represents its parameters. \(\hat{y}_{S}\) and \(\hat{y}_{T}\) represent the predicted ratings, \(n_{S}\) denotes a source-domain item, and \(n_{T}\) denotes a target-domain item.
Figure 2 illustrates the overall workflow of TECDR. In the TECDR model, we employ the Latent Preference Extractor (LPE) to uncover users’ latent preferences and utilize the Domain Knowledge Transferor (DKT) to achieve cross-domain knowledge transfer. Finally, we enhance the framework using meta-learning strategies. The Latent Preference Extractor will be introduced in the following part.
3.2 Latent Preference Extractor
In this section, we will introduce the Latent Preference Extractor (LPE). Traditional recommender system methods use one-hot vectors to encode user and item IDs, where each ID is encoded as a unique vector with only one position as 1 and the rest as 0. This representation leads to a large amount of redundant information, and there is no actual association between user and item IDs, making it challenging to learn better feature representations from the training data. Moreover, one-hot vectors can only represent user and item IDs present in the training set, causing problems with unknown IDs during the testing phase, particularly for new users and items. Therefore, we propose the LPE to help TECDR more effectively uncover users’ latent preferences and reveal potential patterns and associations in the data. This enhances the model’s understanding of the data, improves its generalization ability, and increases prediction accuracy. The LPE mainly involves preprocessing and feature engineering of the raw data to facilitate better extraction of latent preference information. Specifically, we represent user and item features as integers, which help the model capture users’ latent preferences. Then, we transform feature information into one-hot vectors. Finally, to reduce the feature dimension, eliminate redundancy, and improve the algorithm’s runtime, we employ dimension compression techniques. It can be expressed as follows:
\[\begin{align} & E_{m}=g(m,\varphi_{m})=(m_{1}c_{1}\oplus m_{2}c_{2}\oplus \cdots \oplus m_{N}c_{N})^{T} \tag{3} \\ & E_{n}=g(n,\varphi_{n})=(n_{1}v_{1}\oplus n_{2}v_{2}\oplus \cdots \oplus n_{M}v_{M})^{T} \tag{4} \end{align}\]
In the provided notation, \(g()\) represents the function for extracting the latent preferences of user \(m\) and item \(n\). \(\varphi_{m}\) and \(\varphi_{n}\) denote the parameters of the latent preference extraction functions for users and items, respectively. \(c_{N}\) and \(v_{M}\) represent the embedding matrices for the user features of user \(m\) and the item features of item \(n\), respectively. \(m_{N}\) is a one-hot vector encoding the user’s feature information, while \(n_{M}\) is a one-hot vector encoding the item’s feature information. \(E_{m}\) and \(E_{n}\) represent the embedding vectors of user \(m\) and item \(n\), respectively. The symbol \(\oplus\) denotes the concatenation operation.
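To make the extraction pipeline concrete, the following is a minimal PyTorch sketch of how Eqs. (3) and (4) could be realized: each integer-coded feature field is looked up in its own embedding matrix (equivalent to multiplying a one-hot vector by \(c_{i}\) or \(v_{i}\)), and the per-field embeddings are concatenated. The class name, field cardinalities, and dimensions are illustrative assumptions, not part of the original model.

```python
import torch
import torch.nn as nn

class LatentPreferenceExtractor(nn.Module):
    """Sketch of Eq. (3)-(4): each integer-coded feature field is looked up
    in its own embedding matrix (equivalent to multiplying a one-hot vector
    by c_i / v_i), and the per-field embeddings are concatenated."""

    def __init__(self, field_sizes, embed_dim):
        super().__init__()
        # one embedding matrix per feature field (c_1..c_N or v_1..v_M)
        self.fields = nn.ModuleList(
            nn.Embedding(num_categories, embed_dim) for num_categories in field_sizes
        )

    def forward(self, feature_ids):
        # feature_ids: (batch, num_fields) integer-coded features
        parts = [emb(feature_ids[:, i]) for i, emb in enumerate(self.fields)]
        return torch.cat(parts, dim=-1)  # E_m or E_n: (batch, num_fields * embed_dim)

# illustrative usage: a user described by 3 categorical feature fields
lpe_user = LatentPreferenceExtractor(field_sizes=[2, 8, 20], embed_dim=16)
E_m = lpe_user(torch.tensor([[1, 3, 7]]))  # shape (1, 48)
```

The embedding lookup doubles as the dimension compression step mentioned above, since each sparse one-hot field is mapped to a short dense vector.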
3.3 Domain Knowledge Transferor
In this section, we introduce the Domain Knowledge Transferor (DKT), a universal transfer component shared by all users. To facilitate knowledge transfer, we first share user embeddings between the two domains. By sharing user embeddings, both domains jointly learn the feature representation of users. This means that user features learned in one domain can be directly applied in the other domain, improving the representation and interpretation of user features. The input vectors are represented as follows:
\[\begin{align} & A_{S}=[g(m;\varphi_{m})\oplus g(n_{S};\varphi_{n_{S}})]^{T} \tag{5} \\ & A_{T}=[g(m;\varphi_{m})\oplus g(n_{T};\varphi_{n_{T}})]^{T} \tag{6} \end{align}\] |
With the provided notation, \(g()\) represents the function for extracting the latent preferences of user \(m\) and items \(n\). \(\varphi_{m}\), \(\varphi_{n_{S}}\), and \(\varphi_{n_{T}}\) are the parameters of the latent preference extraction functions. The source-domain input vector is represented as \(A_{S}\), and the target-domain input vector as \(A_{T}\).
The Domain Knowledge Transferor (DKT) enhances the performance and effectiveness of cross-domain recommender systems by sharing and transferring knowledge among multiple domains. Essentially, it functions as a cross-domain connectivity network whose learning process consists of two steps. The first step propagates data forward from lower to higher layers; the second, triggered when the first step’s output deviates from expectations, is the training phase in which errors are propagated backward from higher to lower layers. DKT helps the model map users and items from various domains into a shared feature space, allowing it to overcome the cold-start challenge faced by cross-domain recommender systems when the target domain has inadequate data. The mathematical representation is:
\[\begin{align} & k_{mS}^{j}=w_{SS}^{j}k_{mS}^{j-1}+W_{ST}^{j}k_{mT}^{j-1}+b_{mS}^{j} \tag{7} \\ & k_{mT}^{j}=w_{TT}^{j}k_{mT}^{j-1}+W_{TS}^{j}k_{mS}^{j-1}+b_{mT}^{j} \tag{8} \end{align}\] |
Where \(w_{SS}^{j}\) and \(w_{TT}^{j}\) represent the intra-domain preference transfer matrices, and \(W_{ST}^{j}\) and \(W_{TS}^{j}\) represent the cross-domain preference transfer matrices. The cross-domain weight matrices map user and item features from distinct domains into a shared feature space, facilitating knowledge transfer. Each cross-domain weight matrix is a learnable parameter matrix whose dimensions are determined by the feature dimensions of the source and target domains. To facilitate later optimization with the meta-learning approach, we refer to the cross-domain weight matrices of the source and target domains jointly as \(W^{j}\). \(b_{mS}^{j}\) and \(b_{mT}^{j}\) represent the biases for the source and target domains, respectively. \(k_{mS}^{j-1}\) and \(k_{mT}^{j-1}\) are the inputs to DKT, namely the outputs of the \(MLP\) networks in the source and target domains: \(k_{mS}^{j-1}={{MLP}_{S}(A_{S},A_{T};\phi_{S})}^{T}\) and \(k_{mT}^{j-1}={{MLP}_{T}(A_{S},A_{T};\phi_{T})}^{T}\), where \(\phi_{S}\) and \(\phi_{T}\) represent the parameters of the \(MLP\) networks in the source and target domains, respectively. \(k_{mS}^{j}\) and \(k_{mT}^{j}\) are the outputs of DKT, which serve as inputs to the next layer of the network in each domain.
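As an illustration, the following is a minimal PyTorch sketch of one DKT layer implementing Eqs. (7) and (8). It assumes, for simplicity, equal hidden dimensions in both domains; all class and method names are our own.

```python
import torch
import torch.nn as nn

class DKTLayer(nn.Module):
    """Sketch of Eq. (7)-(8): one cross-domain transfer layer. The
    intra-domain matrices (w_SS, w_TT) keep domain-specific signal, while
    the cross-domain matrices (W_ST, W_TS) carry knowledge across domains."""

    def __init__(self, dim):
        super().__init__()
        self.w_ss = nn.Linear(dim, dim, bias=False)  # intra-domain, source
        self.w_tt = nn.Linear(dim, dim, bias=False)  # intra-domain, target
        self.W_st = nn.Linear(dim, dim, bias=False)  # cross-domain, target -> source
        self.W_ts = nn.Linear(dim, dim, bias=False)  # cross-domain, source -> target
        self.b_s = nn.Parameter(torch.zeros(dim))    # source-domain bias b_mS
        self.b_t = nn.Parameter(torch.zeros(dim))    # target-domain bias b_mT

    def forward(self, k_s, k_t):
        k_s_next = self.w_ss(k_s) + self.W_st(k_t) + self.b_s  # Eq. (7)
        k_t_next = self.w_tt(k_t) + self.W_ts(k_s) + self.b_t  # Eq. (8)
        return k_s_next, k_t_next

    def cross_weights(self):
        # the W^j matrices referenced later for regularization and meta-updates
        return [self.W_st.weight, self.W_ts.weight]
```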
In TECDR, each domain exhibits unique characteristics and user behavior models, yet there exists shared information that can be leveraged across diverse domains. For each domain, we construct a dedicated recommender model. Following the single-domain modeling, the introduction of DKT facilitates the transmission and sharing of knowledge across distinct domains. DKT enables the transfer of knowledge between different domains, signifying that insights acquired in one domain can enhance the recommender performance in the target domain. In TECDR, we employ a multi-layer perceptron to predict user ratings for items within individual domains. The MLP is a feedforward neural network composed of multiple fully connected hidden layers, allowing it to learn complex nonlinear relationships. It can be represented as follows:
\[\begin{align} \hat{y}_{m,n}=M\left([E_{m};E_{n}]^{T};\theta\right)=w_{i}^{T}k_{i-1} \tag{9} \end{align}\] |
Where \(M()\) represents a multi-layer perceptron (MLP), a type of feedforward neural network. The symbol \(\theta\) denotes the parameters of the MLP, including the weight matrix \(w\) and bias vector \(b\) of each layer. The notation \(k_{i-1}\) denotes the output of the \((i-1)\)-th hidden layer, obtained by applying the rectified linear unit (ReLU) activation to a linear transformation of the previous layer’s output, i.e., \(k_{i-1}=ReLU(w_{i-1}^{T}k_{i-2}+b_{i-1})\).
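A minimal sketch of the per-domain predictor of Eq. (9) might look as follows; the hidden-layer widths are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RatingMLP(nn.Module):
    """Sketch of Eq. (9): a feedforward predictor over the concatenated
    user/item embeddings [E_m; E_n]."""

    def __init__(self, in_dim, hidden=(64, 32)):
        super().__init__()
        layers, d = [], in_dim
        for h in hidden:
            layers += [nn.Linear(d, h), nn.ReLU()]  # k_i = ReLU(w_i^T k_{i-1} + b_i)
            d = h
        layers.append(nn.Linear(d, 1))              # final linear scoring layer
        self.net = nn.Sequential(*layers)

    def forward(self, E_m, E_n):
        # concatenate user and item embeddings, then predict a scalar rating
        return self.net(torch.cat([E_m, E_n], dim=-1)).squeeze(-1)
```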
3.4 Model Training and Optimization
3.4.1 Optimization of Cross-Domain Weight Matrix
To help DKT effectively transfer useful feature information for learning domain knowledge, we applied Elastic Net regularization [26] to the cross-domain weight matrix \(W^{j}\). Elastic Net regularization is a commonly used technique in deep learning models, which facilitates feature selection by shrinking the weights of irrelevant or redundant features to zero, thereby reducing model complexity. Additionally, Elastic Net regularization is more robust for high-dimensional data. It can be represented as follows:
\[\begin{align} R(W^{j})=\partial \sum_{p=1}^{d^{j}} \sum_{q=1}^{d^{j-1}}\left| e_{pq}^{j} \right| \tag{10} \end{align}\] |
Where \(\partial\) represents the coefficient that controls the sparsity of relevant weights, and \(e_{pq}^{j}\) denotes an element in the cross-domain weight matrix, where \(p\) and \(q\) are used to indicate the row and column positions of \(e_{pq}^{j}\) in the cross-domain weight matrix. \(R()\) represents the Elastic Net regularization.
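Implemented directly as written in Eq. (10), the penalty is an absolute-value sum over the elements of the cross-domain weight matrices. A one-function sketch (with the coefficient \(\partial\) rendered as `alpha`) could be:

```python
def dkt_regularizer(cross_weights, alpha):
    """Sketch of Eq. (10): sum of absolute values over every element e_pq of
    each cross-domain weight matrix W^j (given as a list of torch tensors),
    scaled by the sparsity coefficient alpha."""
    return alpha * sum(W.abs().sum() for W in cross_weights)
```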
3.4.2 Meta-Learning Based Model Optimization
We utilize a meta-learning approach to optimize TECDR. For the cross-domain cold-start task, we train a meta-learning model that progressively updates different parameters during the meta-learning process. In TECDR, the user embedding matrix, item embedding matrix, and domain embedding matrix account for a large portion of the parameters, and updating all of them would increase computational complexity and reduce generalization. We find that it is sufficient to update only the parameters in DKT, which considerably improves training efficiency. The strategy is as follows: during the initial training phase, we update only the cross-domain weight matrices \(W^{j}\) in DKT, where \(1\leq j\leq J\). Subsequently, with the help of DKT and meta-learning, we can quickly discover users’ latent preferences in the source domain.
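A minimal sketch of this selective-update strategy is shown below. It assumes the model exposes its DKT layers as `model.dkt_layers` (our naming, not the paper’s), and freezes every parameter except the cross-domain matrices \(W^{j}\).

```python
import torch

def make_fast_optimizer(model, lr=0.01):
    """Sketch of the fast-adaptation phase described above: only the
    cross-domain matrices W^j receive gradient updates; embeddings and MLP
    weights stay frozen. `model.dkt_layers` is an assumed attribute."""
    cross_params = [p for layer in model.dkt_layers for p in layer.cross_weights()]
    for p in model.parameters():
        p.requires_grad_(False)   # freeze everything...
    for p in cross_params:
        p.requires_grad_(True)    # ...except the W^j matrices
    return torch.optim.SGD(cross_params, lr=lr)
```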
3.4.3 Loss Function
Since TECDR is a cross-domain recommender system model, its loss is defined as the sum of the mean squared errors of the two domains and a regularization term. The regularization term introduces a penalty into the loss function, reducing the risk of overfitting. It can be represented as follows:
\[\begin{align} L\!=\!L_{S}(D,F_{S},Y_{S})\!+\!L_{T}(D,F_{T},Y_{T})\!+\!R\left(\left(W^{j}\right)_{j=1}^{J}\right) \tag{11} \end{align}\] |
Where Elastic Net regularization is denoted as \(R()\). The set \(D\) represents the overlapping user features. \(Y_{S}\) and \(Y_{T}\) represent the rating sets, and \(F_{S}\) and \(F_{T}\) represent the sets of item features in the source and target domains, respectively. \(L_{S}\) and \(L_{T}\) are the loss functions of the source and target domains, defined as \(L_{S}=\frac{1}{I}\sum_{i=1}^I {\big(y_{m,n_{S}}^{i}-\hat{y}_{m,n_{S}}^{i}\big)}^{2}\) and \(L_{T}=\frac{1}{I}\sum_{i=1}^I {\big(y_{m,n_{T}}^{i}-\hat{y}_{m,n_{T}}^{i}\big)}^{2}\), where \(I\) represents the number of interaction records. The overall loss, denoted by \(L\), combines the losses from both domains. The detailed TECDR recommendation process is illustrated in Algorithm 1.
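Putting Eqs. (10) and (11) together, a sketch of the overall training loss could be written as follows; the function and argument names are ours.

```python
import torch.nn.functional as F

def tecdr_loss(y_s, y_s_hat, y_t, y_t_hat, cross_weights, alpha):
    """Sketch of Eq. (11): per-domain mean squared errors plus the Eq. (10)
    penalty over the cross-domain weight matrices."""
    loss_s = F.mse_loss(y_s_hat, y_s)                          # L_S
    loss_t = F.mse_loss(y_t_hat, y_t)                          # L_T
    reg = alpha * sum(W.abs().sum() for W in cross_weights)    # R((W^j)_{j=1}^J)
    return loss_s + loss_t + reg
```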
4. Experiments
In this section, we present the dataset statistics, evaluation metrics, baseline models, and the performance of TECDR on various cross-domain cold-start tasks. To demonstrate its effectiveness and robustness, we conduct comprehensive studies addressing the following research questions:
- RQ1: What is the performance of TECDR in comparison to standard recommender system approaches and cross-domain recommender system approaches?
- RQ2: What is the impact of latent preference extractors and domain knowledge transferors on model performance?
4.1 Experimental Datasets and Evaluation Metrics
Extensive experiments are carried out on a large-scale anonymized dataset obtained from the Douban platform, which enables users to provide feedback on a variety of items across several categories, indicating their preferences. We choose the three largest domain subsets: movies, books, and music. Within each domain there are a certain number of overlapping users who are active in more than one domain. We create three cross-domain cold-start recommendation tasks: movie to music, movie to book, and book to music. For each task, approximately 20% of the overlapping users are randomly chosen as cold-start users. The initial ratings in the three domains are decimal numbers with one decimal place (e.g., 3.2, 5.1, 9.8) with values between 0 and 10. To capture the diversity of ratings more effectively and mitigate the risk of ratings being overly concentrated or dispersed, we employ min-max normalization to rescale the ratings to the range of 0 to 1. The min-max normalization formula is as follows:
\[\begin{align} y_{standard}=\frac{y_{initial}-y_{min}}{y_{max}-y_{min}} \tag{12} \end{align}\]
Where \(y_{standard}\) represents the rating after min-max normalization, \(y_{initial}\) the initial rating, \(y_{min}\) the minimum rating, and \(y_{max}\) the maximum rating. To ensure data quality, we remove all items that received fewer than 5 ratings or reviews from users. We allocate 70% of the data to the training set, 10% to the validation set, and the remaining 20% to the test set. The datasets’ statistics are shown in Table 1 and Table 2. User feature data contains user reviews, tags, and profiles, whereas item feature data includes item descriptions, user ratings, and item comments.
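For reproducibility, a sketch of this preprocessing pipeline (min-max normalization per Eq. (12), removal of items with fewer than 5 ratings, and the 70/10/20 split) is given below; the column names are illustrative assumptions.

```python
import pandas as pd

def preprocess(ratings: pd.DataFrame, seed=42):
    """Sketch of the preprocessing described above; assumes columns
    ("user", "item", "rating")."""
    # drop items with fewer than 5 ratings
    counts = ratings.groupby("item")["item"].transform("size")
    ratings = ratings[counts >= 5].copy()
    # min-max normalize ratings into [0, 1], Eq. (12)
    lo, hi = ratings["rating"].min(), ratings["rating"].max()
    ratings["rating"] = (ratings["rating"] - lo) / (hi - lo)
    # shuffle, then 70/10/20 train/validation/test split
    ratings = ratings.sample(frac=1.0, random_state=seed)
    n = len(ratings)
    train = ratings[: int(0.7 * n)]
    val = ratings[int(0.7 * n): int(0.8 * n)]
    test = ratings[int(0.8 * n):]
    return train, val, test
```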
We adhere to the common evaluation metrics in recommender systems to rank the target and cold-start items for each interaction. The performance of the ranking lists is evaluated using the hit rate (\(HR\)) and normalized discounted cumulative gain (\(NDCG\)). Given the user-item interaction data, the model generates ranking lists for the items.
If an item of interest to the user appears in the top \(K\) positions of the recommendation list, a hit is recorded. \(HR\) assesses the overall accuracy of the recommender system and is expressed as follows:
\[\begin{align} HR@K=\frac{\text{Number of hits}@K}{\left| GT \right|} \tag{13} \end{align}\]
The number of interactions in the test set is represented by \(\left| GT \right|\). The \(HR\) metric takes values between 0 and 1, with greater values indicating that the recommender system ranks items of interest to the user within the top \(K\) positions. \(NDCG\), on the other hand, considers not only whether items of interest are hit but also the order of the recommended items, assigning higher weights to higher-ranked recommendations, which aligns better with actual user behavior. The formula for the normalized discounted cumulative gain (\(NDCG\)) is expressed as follows:
\[\begin{align} NDCG@K=\sum_{i=1}^K \frac{2^{r_{i}}-1}{\log_{2}(i+1)} \tag{14} \end{align}\]
Where \(r_{i}\) indicates whether the item at position \(i\) is of interest to the user: if the user’s item of interest is ranked at position \(i\), then \(r_{i}=1\); otherwise, \(r_{i}=0\).
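For a single held-out interaction with binary relevance, both metrics reduce to simple expressions, as in the following sketch (in this setting the ideal DCG equals 1, so DCG and NDCG coincide); the function and argument names are ours.

```python
import math

def hr_ndcg_at_k(ranked_items, target_item, k=10):
    """Sketch of Eq. (13)-(14) for one test interaction: HR@K is 1 if the
    held-out item appears in the top K, and NDCG@K credits higher ranks more."""
    topk = ranked_items[:k]
    if target_item not in topk:
        return 0.0, 0.0
    rank = topk.index(target_item)          # 0-based, so position i = rank + 1
    return 1.0, 1.0 / math.log2(rank + 2)   # (2^1 - 1) / log2(i + 1)

# averaged over the test set, the first value matches hits@K / |GT|
```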
4.2 Baseline Models
We compare the TECDR against two types of approaches to demonstrate its feasibility and effectiveness: traditional recommender system methods and cross-domain recommender system methods. Here is a description of the baselines:
- NCF [27]: A conventional collaborative filtering-based recommender system method that employs a multi-layer perceptron (MLP) to learn the user-item interaction function. To properly simulate the user’s implicit feedback, the technique embeds user and item features.
- LightGCN [28]: A collaborative filtering recommender system based on graph convolutional networks. The method leverages the linear propagation of the user-item interaction graph to learn embeddings for users and items. During the layer aggregation process, a weighted sum approach is employed to obtain the final embeddings.
- DDTCDR [14]: A cross-domain recommender system approach that uses latent orthogonal mappings to gather user preferences across several domains while keeping user associations within each domain intact. The method employs an iterative updating mechanism to improve the cross-domain recommendation model.
- DML [29]: A cross-domain recommender system method based on dual learning. It adopts an iterative strategy to exchange information between the two domains, while metric learning reduces its reliance on overlapping users.
- GADTCDR [30]: A dual-target cross-domain recommender system model. It leverages ratings and content information from both domains to construct separate heterogeneous graphs for updating user and item embeddings, and uses an element-wise attention mechanism to integrate the embeddings of common users.
Among them, NCF and LightGCN are traditional recommender system algorithms, while DDTCDR, DML, and GADTCDR are cross-domain recommender system algorithms. The configuration of hyperparameters significantly influences model performance. The key hyperparameters encompass the dimensions of user vectors and item vectors, the learning rate, and the batch size. In TECDR, \(J\) is set to 4. The hyperparameter values for TECDR and the baseline models are determined through grid search validated on the validation set. The search space for user vector dimensions is {8, 16, 32, 64, 128}, the search space for item vector dimensions is {8, 16, 32, 64, 128}, the learning rate is selected from {0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1}, and the batch size is chosen from {8, 16, 32, 64, 128, 256}. The specific configurations are detailed in Table 3.
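A sketch of the grid search over this space is given below; `evaluate` stands for any routine (an assumption of ours) that trains a model with a given configuration and returns its validation score, e.g., \(NDCG@10\).

```python
from itertools import product

def grid_search(evaluate, grid):
    """Exhaustive search over a hyperparameter grid; `evaluate` maps a
    config dict to a validation score and is supplied by the caller."""
    best_cfg, best_score = None, float("-inf")
    for values in product(*grid.values()):
        cfg = dict(zip(grid.keys(), values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# the search space reported above
grid = {
    "user_dim":   [8, 16, 32, 64, 128],
    "item_dim":   [8, 16, 32, 64, 128],
    "lr":         [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1],
    "batch_size": [8, 16, 32, 64, 128, 256],
}
```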
4.3 Performance Comparison (RQ1)
We compare TECDR with the five baseline models in this section. Table 4 shows the performance of every model on the three cross-domain recommendation scenarios. In most scenarios, TECDR outperforms the baseline models. Further investigation reveals that, when confronted with the cold-start issue in cross-domain recommendation, TECDR, which incorporates the latent preference extractor and the domain knowledge transferor, outperforms both traditional and cross-domain methods. TECDR achieves a performance improvement of 1.8% to 4.1% over the best baseline across the different tasks.
NCF performs the poorest in all cross-domain cold-start recommender tasks, indicating that learning user-item interactions using a multi-layer perceptron alone is insufficient. LightGCN, which utilizes linear propagation of the user-item interaction graph, achieves better results than NCF. However, neither of them has achieved the best performance.
We observed that cross-domain recommender algorithms outperformed traditional recommender algorithms in most tasks, demonstrating the importance of preference transfer in cross-domain recommendation. However, some cross-domain methods, such as DML and DDTCDR, performed poorly. We found this was due to their use of latent orthogonal mappings to extract user preferences across different domains: since domains may have distinct features and preferences, mapping them into a shared space can lose domain-specific information, and enforcing orthogonality constraints may limit the model’s capacity to capture correlations among user preferences across domains.
On the other hand, GADTCDR, which utilizes a representation composition method, achieved better results than DDTCDR. This improvement arises because representation composition methods typically do not map or compress features; they retain the original feature information, preserving domain-specific characteristics and the correlations among user preferences in different domains, thereby avoiding the information loss that can occur during mapping. We also found that algorithms combining implicit features perform better than those using only explicit features.

TECDR uses the latent preference extractor to learn users’ latent preferences and the domain knowledge transferor to enhance its understanding of user features, and it achieves the best performance compared with all five baseline models, underscoring the advantages of both components. The latent preference extractor unveils potential patterns and associations in the data through feature engineering, improving the model’s interpretability, generalization capability, and predictive performance. Unlike simple ID encoding, it represents user and item feature information as integers, which supports a deeper understanding of user behavior and preferences and lets the model better capture the actual relationships and attributes between users and items. Its dimension compression reduces feature dimensions and redundancy, decreasing the risk of overfitting and thereby improving recommendation performance. The domain knowledge transferor, through shared user embeddings, enables the model to jointly learn user feature representations across the two domains: user information acquired in one domain can be applied directly in the other, enhancing the representation and understanding of user features and allowing the model to adapt more effectively to diverse domain data, which ultimately improves generalization.

Furthermore, we observed that TECDR’s improvement is most significant in the cross-domain cold-start task from movies to books (Scenario 2), likely because Scenario 2 contains more overlapping users, which helps the model capture users’ interest preferences. The improvement in Scenario 3 is relatively modest, possibly due to the high sparsity of the book and music datasets, which limits what the model can learn from sparse interactions.
4.4 Ablation Study (RQ2)
Through ablation experiments, we evaluate the effects of the latent preference extractor and domain knowledge transferor on TECDR in this section. For the sake of fairness, the settings remain unchanged except for the specified ablation module.
- TECDR-LPE: This variant does not use the latent preference extractor.
- TECDR-LPE&DKT: This variant does not use the latent preference extractor or the domain knowledge transferor.
Figures 3 and 4 present the performance comparison of these methods across the three scenarios. We made the following observations: TECDR-LPE exhibits a significant drop in performance compared to TECDR, indicating the importance of the latent preference extractor to the model.
Fig. 3 The performance of the TECDR variants on the three tasks using \(HR@10\) as the evaluation metric.
Fig. 4 The performance of the TECDR variants on the three tasks using \(NDCG@10\) as the evaluation metric.
On both evaluation metrics, \(HR@10\) and \(NDCG@10\), TECDR surpasses TECDR-LPE, showing the usefulness of the latent preference extractor. TECDR-LPE&DKT shows the worst performance on both metrics across all three scenarios, underscoring the effectiveness of the domain knowledge transferor. Furthermore, we observed that TECDR-LPE&DKT’s performance drops most notably in Scenario 3, which may be attributed to the sparse nature of the source and target datasets in this scenario, hindering the model’s ability to learn users’ interest preferences effectively.
5. Conclusion and Future Work
We propose TECDR, a cross-domain recommender system that addresses the cold-start problem in cross-domain recommendation by using a domain knowledge transferor and a latent preference extractor. TECDR uses the latent preference extractor to augment user and item information and facilitate the discovery of users’ latent preferences, while the domain knowledge transferor efficiently transfers preferences from one domain to another. We also use a meta-learning-based optimization strategy to speed up training. Experimental results demonstrate that TECDR achieves promising performance, showcasing its effectiveness on cross-domain recommendation tasks.
In future work, we plan to further enhance TECDR from two perspectives. Firstly, we intend to explore the adoption of ensemble learning techniques [31], such as model fusion and stacking, to combine models from multiple domains. By considering the strengths of multiple models, we aim to achieve improved recommender performance. Secondly, we will address the domain imbalance issue by considering domain balancing methods [32]. We will investigate the use of sample weighting techniques to balance the sample distribution across different domains, ensuring that each domain contributes adequately to the recommender system.
Acknowledgements
This research was funded by the National Natural Science Foundation of China under grant number 61972182.
References
[1] H. Chen, Z. Wang, F. Huang, X. Huang, Y. Xu, Y. Lin, P. He, and Z. Li, “Generative adversarial framework for cold-start item recommendation,” Proc. 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp.2565-2571, 2022.
[2] T. Wei and J. He, “Comprehensive fair meta-learned recommender system,” Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp.1989-1999, 2022.
[3] Y. Zheng, S. Liu, Z. Li, and S. Wu, “Cold-start sequential recommendation via meta learner,” Proc. AAAI Conference on Artificial Intelligence, vol.35, no.5, pp.4706-4713, 2021.
[4] H. Xu, C. Li, Y. Zhang, L. Duan, I.W. Tsang, and J. Shao, “Metacar: Cross-domain meta-augmentation for content-aware recommendation,” IEEE Trans. Knowl. Data Eng., vol.35, no.8, pp.8199-8212, 2023.
[5] Y. Zhu, Z. Tang, Y. Liu, F. Zhuang, R. Xie, X. Zhang, L. Lin, and Q. He, “Personalized transfer of user preferences for cross-domain recommendation,” Proc. Fifteenth ACM International Conference on Web Search and Data Mining, pp.1507-1515, 2022.
[6] S. Luo, Y. Li, P. Gao, Y. Wang, and S. Serikawa, “Meta-SEG: A survey of meta-learning for image segmentation,” Pattern Recognition, vol.126, 108586, 2022.
[7] L. Chen, S. Lu, and T. Chen, “Understanding benign overfitting in gradient-based meta learning,” Advances in Neural Information Processing Systems, vol.35, pp.19887-19899, 2022.
[8] W. Wei, C. Huang, L. Xia, Y. Xu, J. Zhao, and D. Yin, “Contrastive meta learning with behavior multiplicity for recommendation,” Proc. Fifteenth ACM International Conference on Web Search and Data Mining, pp.1120-1128, 2022.
[9] H. Papadakis, A. Papagrigoriou, C. Panagiotakis, E. Kosmas, and P. Fragopoulou, “Collaborative filtering recommender systems taxonomy,” Knowledge and Information Systems, vol.64, no.1, pp.35-74, 2022.
[10] G. Jain, T. Mahara, S.C. Sharma, and A.K. Sangaiah, “A cognitive similarity-based measure to enhance the performance of collaborative filtering-based recommendation system,” IEEE Trans. Comput. Social Syst., vol.9, no.6, pp.1785-1793, 2022.
[11] T. Anwar, V. Uma, M.I. Hussain, and M. Pantula, “Collaborative filtering and kNN based recommendation to overcome cold start and sparsity issues: A comparative analysis,” Multimedia Tools and Applications, vol.81, no.25, pp.35693-35711, 2022.
[12] O. Barkan, N. Koenigstein, E. Yogev, and O. Katz, “CB2CF: A neural multiview content-to-collaborative filtering model for completely cold item recommendations,” Proc. 13th ACM Conference on Recommender Systems, pp.228-236, 2019.
[13] J. Li, M. Jing, K. Lu, L. Zhu, Y. Yang, and Z. Huang, “From zero-shot learning to cold-start recommendation,” Proc. AAAI Conference on Artificial Intelligence, vol.33, no.1, pp.4189-4196, 2019.
[14] P. Li and A. Tuzhilin, “DDTCDR: Deep dual transfer cross domain recommendation,” Proc. 13th International Conference on Web Search and Data Mining, pp.331-339, 2020.
[15] S. Xu, Y. Wang, Y. Wang, Z. O’Neill, and Q. Zhu, “One for many: Transfer learning for building HVAC control,” Proc. 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, pp.230-239, 2020.
[16] L.-E. Wang, Y. Wang, Y. Bai, P. Liu, and X. Li, “POI recommendation with federated learning and privacy preserving in cross domain recommendation,” IEEE INFOCOM 2021-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp.1-6, 2021.
[17] L.-E. Wang, Y. Qi, Y. Bai, Z. Sun, D. Li, and X. Li, “MuKGB-CRS: Guarantee privacy and authenticity of cross-domain recommendation via multi-feature knowledge graph integrated blockchain,” Information Sciences, vol.638, 118915, 2023.
[18] F. Zhu, C. Chen, Y. Wang, G. Liu, and X. Zheng, “DTCDR: A framework for dual-target cross-domain recommendation,” Proc. 28th ACM International Conference on Information and Knowledge Management, pp.1533-1542, 2019.
[19] M. Ma, P. Ren, Z. Chen, Z. Ren, L. Zhao, P. Liu, J. Ma, and M. de Rijke, “Mixed information flow for cross-domain sequential recommendations,” ACM Transactions on Knowledge Discovery from Data (TKDD), vol.16, no.4, pp.1-32, 2022.
[20] R. Xie, Q. Liu, L. Wang, S. Liu, B. Zhang, and L. Lin, “Contrastive cross-domain recommendation in matching,” Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp.4226-4236, 2022.
[21] Y. Di and Y. Liu, “MFPCDR: A meta-learning-based model for federated personalized cross-domain recommendation,” Applied Sciences, vol.13, no.7, 4407, 2023.
[22] A.K. Sahu and P. Dwivedi, “User profile as a bridge in cross-domain recommender systems for sparsity reduction,” Applied Intelligence, vol.49, no.7, pp.2461-2481, 2019.
[23] C. Zhao, H. Zhao, M. He, J. Zhang, and J. Fan, “Cross-domain recommendation via user interest alignment,” Proc. ACM Web Conference 2023, pp.887-896, 2023.
[24] R. Liang, Q. Zhang, J. Wang, and J. Lu, “A hierarchical attention network for cross-domain group recommendation,” IEEE Trans. Neural Netw. Learn. Syst., 2022.
[25] R. Guan, H. Pang, F. Giunchiglia, Y. Liang, and X. Feng, “Cross-domain meta-learner for cold-start recommendation,” IEEE Trans. Knowl. Data Eng., 2022.
[26] R. De Leone, N. Egidi, and L. Fatone, “The use of grossone in elastic net regularization and sparse support vector machines,” Soft Computing, vol.24, no.23, pp.17669-17677, 2020.
[27] X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T.-S. Chua, “Neural collaborative filtering,” Proc. 26th International Conference on World Wide Web, pp.173-182, 2017.
[28] X. He, K. Deng, X. Wang, Y. Li, Y.D. Zhang, and M. Wang, “LightGCN: Simplifying and powering graph convolution network for recommendation,” Proc. 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp.639-648, 2020.
[29] P. Li and A. Tuzhilin, “Dual metric learning for effective and efficient cross-domain recommendations,” IEEE Trans. Knowl. Data Eng., vol.35, no.1, pp.321-334, 2021.
[30] F. Zhu, Y. Wang, C. Chen, G. Liu, and X. Zheng, “A graphical and attentional framework for dual-target cross-domain recommendation,” Proc. IJCAI, pp.3001-3008, 2020.
[31] A. Alsswey, H. Al-Samarraie, F.A. El-Qirem, and F. Zaqout, “M-learning technology in Arab Gulf countries: A systematic review of progress and recommendations,” Education and Information Technologies, vol.25, no.4, pp.2919-2931, 2020.
[32] T.N.T. Tran, A. Felfernig, C. Trattner, and A. Holzinger, “Recommender systems in the healthcare domain: state-of-the-art and research issues,” Journal of Intelligent Information Systems, vol.57, no.1, pp.171-201, 2020.